1. Introduction



In this paper we study a Wigner matrix $H$ - a random $N \times N$ matrix whose entries are independent up
to symmetry constraints - that has been deformed by the addition of a finite-rank matrix $A$ belonging to
the same symmetry class as $H$. By Weyl's eigenvalue interlacing inequalities, such a deformation does not
influence the global statistics of the eigenvalues as $N \to \infty$. Thus, the empirical eigenvalue densities of
the deformed matrix $H + A$ and the undeformed matrix $H$ have the same large-scale asymptotics, and are
governed by Wigner's famous semicircle law. However, the behaviour of individual eigenvalues may change
dramatically under such a deformation. In particular, deformed Wigner matrices may exhibit outliers -
eigenvalues detached from the bulk spectrum. They were first investigated in [20] for a particular rank-one
deformation. Subsequently, much progress has been made. We refer to [21, 23] for recent
developments.

We normalize $H$ so that its spectrum is asymptotically given by the interval $[-2,2]$. The creation of
an outlier is associated with a sharp transition, where the magnitude of an eigenvalue $d_i$ of $A$ exceeds the
threshold 1. As $d_i$ (respectively $-d_i$) becomes larger than 1, the largest (respectively smallest) non-outlier
eigenvalue of $H + A$ detaches itself from the bulk spectrum and becomes an outlier. This transition is
conjectured to take place on the scale $|d_i| - 1 \sim N^{-1/3}$. In fact, this scale was established in [1, 5, 7, 22] for
the special cases where $H$ is Gaussian - the Gaussian Orthogonal Ensemble (GOE) and the Gaussian Unitary
Ensemble (GUE). We sketch the results of [1, 5, 7, 22] in the case of additive deformations of GOE/GUE. For
simplicity, we consider rank-one deformations, although the results of [1, 5, 7, 22] cover arbitrary finite-rank
deformations. Let the eigenvalue $d$ of $A$ be of the form $d = 1 + w N^{-1/3}$ for some fixed $w \in \mathbb{R}$. In [1, 5, 7, 22],
the authors proved for any fixed $w$ the weak convergence

$N^{2/3} (\lambda_N(H + A) - 2) \Longrightarrow \Lambda_w$,

where $\lambda_N(H + A)$ denotes the largest eigenvalue of $H + A$. In particular, the largest eigenvalue of $H + A$
fluctuates on the scale $N^{-2/3}$. Moreover, the asymptotics in $w$ of the law $\Lambda_w$ was analysed in [1, 5, 7, 22]: as
$w \to +\infty$ (and after an appropriate affine scaling), the law $\Lambda_w$ converges to a Gaussian; as $w \to -\infty$ the
law $\Lambda_w$ converges to the Tracy-Widom-$\beta$ distribution (where $\beta = 1$ for GOE and $\beta = 2$ for GUE), which
famously governs the distribution of the largest eigenvalue of the undeformed matrix $H$ [27, 28].



The proofs of [1, 22] use an asymptotic analysis of Fredholm determinants, while those of [5, 7] use an
explicit tridiagonal representation of $H$; both of these approaches rely heavily on the Gaussian nature of
$H$. In order to study the phase transition for non-Gaussian matrix ensembles, and in particular address the
question of spectral universality, a different approach is needed. Interestingly, it was observed in [8-10] that
the distribution of the outliers is not universal, and may depend on the law of $H$ as well as the geometry of
the eigenvectors of $A$. The non-universality of the outliers was further investigated in [21, 23, 24].

In a recent paper [21], we considered finite-rank deformations of a Wigner matrix whose entries have
subexponential decay. The two main results of [21] may be informally summarized as follows.

(a) We proved that the non-outliers of $H + A$ stick to the extremal eigenvalues of the original Wigner matrix
$H$ with high precision, provided that each eigenvalue $d_i$ of $A$ satisfies
$\bigl| |d_i| - 1 \bigr| \geq (\log N)^{C \log\log N} N^{-1/3}$.

(b) We identified the asymptotic distribution of a single outlier, provided that (i) it is separated from the
asymptotic bulk spectrum $[-2,2]$ by at least $(\log N)^{C \log\log N} N^{-2/3}$, and (ii) it does not overlap with
any other outlier of $H + A$. Here, two outliers are said to overlap if their separation is comparable to
the scale on which they fluctuate; see Section 2.2 below for a precise definition.



Note that the assumption (i) of (b) is optimal, up to the logarithmic factor $(\log N)^{C \log\log N}$. Indeed, the
extremal bulk eigenvalues of $H + A$ are known [21, Theorem 2.7] to fluctuate on the scale $N^{-2/3}$; for an
eigenvalue of $H + A$ to be an outlier, therefore, we require that its distance from the asymptotic bulk spectrum
$[-2,2]$ be much greater than $N^{-2/3}$. See Section 2.2 below for more details.

The goal of this paper is to extend the result (b) by obtaining a complete description of the asymptotic
distribution of the outliers. Our only assumptions on the deformation $A \equiv A_N$ are that its rank be fixed and
its norm bounded. (In particular, the eigenvalues of $A$ may depend on $N$ in an arbitrary fashion, provided
they remain bounded, and its eigenvectors may be an arbitrary orthonormal family.) Our main result gives
the asymptotic joint distribution of all outliers. Here, an outlier is by definition an eigenvalue of $H + A$
whose classical location (see (2.5) below) is separated from the asymptotic bulk spectrum $[-2,2]$ by at least
$(\log N)^{C \log\log N} N^{-2/3}$ for some (large) constant $C$. Our main result is given in Theorem 2.11
below.

Thus, in this paper we extend the result (b) in two directions: we allow overlapping outliers, and we derive 
the joint asymptotic distribution of all outliers. The distribution of overlapping outliers is more complicated 
than that of non-overlapping outliers, as overlapping outliers exhibit a level repulsion similar to that among 
the bulk eigenvalues of Wigner matrices. This repulsion manifests itself by the joint distribution of a group 
of overlapping outliers being given by the distribution of eigenvalues of a small (explicit) random matrix 
(see (2.15) below). The mechanism underlying the repulsion among outliers is therefore the same as that 
for the eigenvalues of GUE: the Jacobian relating the eigenvalue-eigenvector entries to the matrix entries 
has a Vandermonde determinant structure, and vanishes if two eigenvalues coincide. Several special cases of 
overlapping outliers have already been studied in the works [8, 10, 23, 24], which in particular exhibited the
level repulsion mechanism described above. 

Due to this level repulsion, overlapping outliers are obviously not asymptotically independent. A novel
observation, which follows from our main result, is that in general non-overlapping outliers are not
asymptotically independent either; in this case the lack of independence does not arise from level repulsion, but
from a more subtle interplay between the distribution of $H$ and the geometry of the eigenvectors of $A$. In
some special cases, such as GOE/GUE, non-overlapping outliers are, however, asymptotically independent.
More precisely, our main result (Theorem 2.11 below) shows that two outliers may, under suitable conditions
on $H$ and $A$, be strongly correlated in the limit $N \to \infty$, even if they are far from each other (for instance
on opposite sides of the bulk spectrum).

Finally, we note that throughout this paper we assume that the entries of $H$ have subexponential decay.
We need this assumption because our proof relies heavily on the local semicircle law and eigenvalue rigidity
estimates for $H$, proved in [17] under the assumption of subexponential decay. However, this assumption is
not fundamental to our approach, which may be combined with the recent methods for dealing with
heavy-tailed Wigner matrices developed in [11, 12, 29]. Moreover, the assumption that the norm of $A$ be bounded
may be easily removed; in fact, large eigenvalues of $A$ are easier to treat than small ones.



We remark that recently Pizzo, Renfrew, and Soshnikov [23, 24] took a different approach, and derived
the asymptotic distribution of a single group of overlapping outliers under optimal tail assumptions on $H$.
On the other hand, in [23, 24] it is assumed that the eigenvalues of $A$ are independent of $N$ and that its
eigenvectors satisfy a condition which roughly constrains them to be either strongly localized or delocalized.



1.1. Outline of the proof. As in [21], our proof relies on the isotropic local semicircle law, proved in
[21, Theorems 2.2 and 2.3]. The isotropic local semicircle law is an extension of the local semicircle law, whose
study was initiated in [14, 15]. The local semicircle law has since become a cornerstone of random matrix
theory, in particular in establishing the universality of Wigner matrices [13, 16-18, 25, 26]. The strongest
versions of the local semicircle law, proved in [11, 17], give precise estimates on the local eigenvalue density,
down to scales containing $N^{\varepsilon}$ eigenvalues. In fact, as formulated in [17], the local semicircle law gives optimal
high-probability estimates on the quantity

$G_{ij}(z) - \delta_{ij} m(z)$, (1.1)

where $m(z)$ denotes the Stieltjes transform of Wigner's semicircle law and $G(z) := (H - z)^{-1}$ is the resolvent
of $H$.

The isotropic local semicircle law is a generalization of the local semicircle law, in that it gives optimal
high-probability estimates on the quantity

$\langle \mathbf{v}, (G(z) - m(z) 1) \mathbf{w} \rangle$, (1.2)

where $\mathbf{v}$ and $\mathbf{w}$ are arbitrary deterministic vectors. Clearly, (1.1) is a special case obtained from (1.2) by
setting $\mathbf{v} = \mathbf{e}_i$ and $\mathbf{w} = \mathbf{e}_j$, where $\mathbf{e}_i$ denotes the $i$-th standard basis vector of $\mathbb{C}^N$.



As in the works [21, 23, 24], a major part of our proof consists in deriving the asymptotic distribution of
the entries of $G(z)$. The main technical achievement of this paper is to obtain the joint asymptotics of an
arbitrary finite family of variables of the form $\langle \mathbf{v}, G(z) \mathbf{w} \rangle$, whereby the spectral parameters $z$ of different
entries may differ, and are assumed to satisfy $2 + (\log N)^{C \log\log N} N^{-2/3} \leq |\operatorname{Re} z| \leq C$ for some positive
constant $C$. The question of the joint asymptotics of the resolvent entries occurs more generally in several
problems on deformed random matrix models, and we therefore believe that the techniques of this paper are
also of interest for other problems on deformed matrix ensembles.

An important ingredient in our proof is the four-step strategy introduced in [21]. It may be summarised
as follows: (i) reduction to the distribution of the resolvent $G$, (ii) the case of Gaussian $H$, (iii) the case
of almost Gaussian $H$, (iv) the case of general $H$. Steps (i)-(iii) in the current paper are substantially
different from their counterparts in [21]; this results from treating an entire overlapping group of outliers
simultaneously, as well as from the need to develop an argument that admits an analysis of the joint law of
different groups. In fact, for pedagogical reasons, first - in Sections 4-7 - we give the proof for the case of a
single group of overlapping outliers^1 and then - in Section 9.1 - extend it to yield the full joint distribution.
In contrast to the steps (i)-(iii), step (iv) survives almost unchanged from [21], and in Section 7 we give an
explanation of the required modifications.

Another ingredient of our proof is a two-level partitioning of the outliers combined with near-degenerate
perturbation theory for eigenvalues. Roughly, outliers are partitioned into blocks depending on whether
they overlap. In the finer partition, denoted by $\Pi$ below (see Definition 2.10), we regroup two outliers into
the same block if their mean separation is some large constant (denoted by $s$ below) times the magnitude
of their fluctuations. Due to logarithmic error factors of the form $(\log N)^{C \log\log N}$ that appear naturally
in high-probability estimates pervading our proof, we shall require a second, coarser, partition, denoted by
$\Gamma$ below (see Definition 9.1). In $\Gamma$, we regroup two outliers into the same block if their mean separation
is $(\log N)^{C \log\log N}$ times the magnitude of their fluctuations. The link between $\Gamma$ and $\Pi$ is provided by
perturbation theory, and is performed in Sections 8 (for a single group) and 9 (for the full joint distribution).

^1 In the resolvent language, this means that the spectral parameters $z$ of all the resolvent entries coincide.



2. Formulation of results 



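2.1. The setup. Let $H = (h_{ij})_{i,j=1}^N$ be an $N \times N$ random matrix. We assume that the upper-triangular
entries $(h_{ij} : i \leq j)$ are independent complex-valued random variables. The remaining entries of $H$ are given
by imposing $H = H^*$. Here $H^*$ denotes the Hermitian conjugate of $H$. We assume that all entries are
centred, $\mathbb{E} h_{ij} = 0$. In addition, we assume that one of the two following conditions holds.

(i) Real symmetric Wigner matrix: $h_{ij} \in \mathbb{R}$ for all $i,j$ and

$\mathbb{E} h_{ii}^2 = \frac{2}{N}$, $\mathbb{E} h_{ij}^2 = \frac{1}{N}$ $(i \neq j)$.

(ii) Complex Hermitian Wigner matrix:

$\mathbb{E} h_{ii}^2 = \frac{1}{N}$, $\mathbb{E} |h_{ij}|^2 = \frac{1}{N}$, $\mathbb{E} h_{ij}^2 = 0$ $(i \neq j)$.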

We introduce the usual index $\beta$ of random matrix theory, defined to be 1 in the real symmetric case and 2
in the complex Hermitian case. We use the abbreviation GOE/GUE to mean GOE if $H$ is a real symmetric
Wigner matrix with Gaussian entries and GUE if $H$ is a complex Hermitian Wigner matrix with Gaussian
entries. We assume that the entries of $H$ have uniformly subexponential decay, i.e. that there exists a
constant $\vartheta > 0$ such that

$\mathbb{P}(\sqrt{N} |h_{ij}| \geq x) \leq \vartheta^{-1} \exp(-x^{\vartheta})$ (2.1)

for all $i$, $j$, and $N$. Note that we do not assume the entries of $H$ to be identically distributed, and we do not
require any smoothness in the distribution of the entries of $H$.

We consider a deformation of fixed, finite rank $r \in \mathbb{N}$. Let $V \equiv V_N$ be a deterministic $N \times r$ matrix
satisfying $V^* V = 1_r$, and $D \equiv D_N$ be a deterministic $r \times r$ diagonal matrix whose eigenvalues are nonzero.
Both $V$ and $D$ depend on $N$. We sometimes also use the notation $V = [\mathbf{v}^{(1)}, \dots, \mathbf{v}^{(r)}]$, where
$\mathbf{v}^{(1)}, \dots, \mathbf{v}^{(r)} \in \mathbb{C}^N$ are orthonormal, as well as $D = \operatorname{diag}(d_1, \dots, d_r)$. We always assume that the
eigenvalues of $D$ satisfy

$-\Sigma + 1 \leq d_1 \leq d_2 \leq \cdots \leq d_r \leq \Sigma - 1$, (2.2)

where $\Sigma$ is some fixed positive constant. We are interested in the spectrum of the deformed matrix

$\widetilde{H} := H + V D V^* = H + \sum_{i=1}^{r} d_i \mathbf{v}^{(i)} (\mathbf{v}^{(i)})^*$.

The following definition summarizes our conventions for the spectrum of a matrix. For our purposes it 
is important to allow the matrix entries and its eigenvalues to be indexed by an arbitrary subset of positive 
integers. 

Definition 2.1. Let $\pi$ be a finite set of positive integers, and let $A = (A_{ij})_{i,j \in \pi}$ be a $|\pi| \times |\pi|$ Hermitian
matrix whose entries are indexed by elements of $\pi$. We denote by

$\sigma(A) := (\lambda_i(A))_{i \in \pi} \in \mathbb{R}^{\pi}$

the family of eigenvalues of $A$. We always order the eigenvalues so that $\lambda_i(A) \leq \lambda_j(A)$ if $i \leq j$.

By a slight abuse of notation, we sometimes identify $\sigma(A)$ with the set $\{\lambda_i(A)\}_{i \in \pi} \subset \mathbb{R}$. Thus, for
instance, $\operatorname{dist}(\sigma(A), \sigma(B)) := \min_{i,j} |\lambda_i(A) - \lambda_j(B)|$ denotes the distance between $\sigma(A)$ and $\sigma(B)$ viewed as
subsets of $\mathbb{R}$.






We abbreviate the (random) eigenvalues of $H$ and $\widetilde{H}$ by

$\lambda_\alpha := \lambda_\alpha(H)$, $\mu_\alpha := \lambda_\alpha(\widetilde{H})$.



The following definition introduces a convenient notation for minors of matrices. 

Definition 2.2 (Minors). For an $r \times r$ matrix $A = (A_{ij})_{i,j=1}^{r}$ and a subset $\pi \subset \{1, \dots, r\}$ of integers, we
define the $|\pi| \times |\pi|$ matrix

$A_{[\pi]} = (A_{ij})_{i,j \in \pi}$.

We shall frequently make use of the logarithmic control parameter

$\varphi \equiv \varphi_N := (\log N)^{\log\log N}$. (2.3)

The interpretation of $\varphi$ is that of a slowly growing parameter (note that $\varphi \leq N^{\varepsilon}$ for any $\varepsilon > 0$ and large enough
$N \geq N_0(\varepsilon)$). Throughout this paper, every quantity that is not explicitly a constant may depend on $N$,
with the sole exception of the rank $r$ of the deformation which is required to be fixed. Unless needed, we
consistently drop the argument $N$ from such quantities.
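To get a feeling for how slowly the control parameter in (2.3) grows, and for how large $N_0(\varepsilon)$ has to be, the following minimal sketch compares $\log \varphi_N = (\log\log N)^2$ with $\varepsilon \log N$. The function names are illustrative and not part of the paper; everything is computed on the logarithmic scale to avoid overflow.

```python
import math

def log_phi(log_n: float) -> float:
    """Return log(phi_N) = (log log N)^2 for phi_N = (log N)^(log log N), given log N."""
    return math.log(log_n) ** 2

def phi_below_power(log_n: float, eps: float) -> bool:
    """True if phi_N <= N^eps, compared on the logarithmic scale."""
    return log_phi(log_n) <= eps * log_n

# Tabulate a few values of log N for eps = 0.5 and eps = 0.1.
for log_n in (10.0, 14.0, 200.0, 400.0):
    print(log_n, log_phi(log_n), phi_below_power(log_n, 0.5), phi_below_power(log_n, 0.1))
```

In this sketch the bound $\varphi \leq N^{1/2}$ fails at $\log N = 10$ but holds at $\log N = 14$, while $\varphi \leq N^{1/10}$ still fails at $\log N = 200$; the threshold $N_0(\varepsilon)$ can therefore be very large for small $\varepsilon$.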

We denote by $C$ a generic positive large constant, whose value may change from one expression to the
next. For two positive quantities $A_N$ and $B_N$ we use the notation $A_N \asymp B_N$ to mean $C^{-1} A_N \leq B_N \leq C A_N$
for some positive constant $C$. Moreover, we write $A_N \ll B_N$ if $A_N / B_N \to 0$, and $A_N \gg B_N$ if $B_N \ll A_N$.
Finally, for $a < b$ we set $[\![a, b]\!] := [a, b] \cap \mathbb{Z}$.

2.2. Heuristics of outliers. Before stating our results, we give a heuristic description of the behaviour of the
outliers. An eigenvalue $d_i$ of $D$ satisfying

$|d_i| - 1 \gg N^{-1/3}$ (2.4)

gives rise to an outlier $\mu_{\alpha(i)}$ located around its classical location $\theta(d_i)$, where we defined, for $d \in \mathbb{R} \setminus (-1, 1)$,

$\theta(d) := d + \frac{1}{d}$ (2.5)

and

$\alpha(i) := i$ if $d_i < 0$, $\alpha(i) := N - r + i$ if $d_i > 0$. (2.6)

The condition (2.4) may be heuristically understood as follows; for simplicity set $r = 1$ and $D = d > 1$.
The extremal eigenvalues of $\widetilde{H}$ that are not outliers fluctuate on the scale $N^{-2/3}$ (see [21, Theorem 2.7]),
the same scale as the extremal eigenvalues of the undeformed matrix $H$. For the largest eigenvalue $\mu_N$ of $\widetilde{H}$
to be an outlier, we require that its separation from the asymptotic bulk spectrum $[-2,2]$, which is of the
order $\theta(d) - 2$, be much greater than $N^{-2/3}$. This leads to the condition (2.4) by a simple expansion of $\theta$
around 1.
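The heuristic above can be illustrated numerically. The following is a minimal sketch, assuming a GOE matrix for $H$ and a rank-one deformation along the first basis vector; the helper names (`sample_goe`, `theta`, `top_eigenvalue_rank_one`) and the matrix size are illustrative choices, not taken from the paper.

```python
import numpy as np

def sample_goe(n: int, rng: np.random.Generator) -> np.ndarray:
    """Real symmetric Gaussian Wigner matrix whose spectrum is close to [-2, 2]."""
    g = rng.standard_normal((n, n))
    # Off-diagonal entries get variance 1/n, diagonal entries variance 2/n.
    return (g + g.T) / np.sqrt(2.0 * n)

def theta(d: float) -> float:
    """Classical location d + 1/d of the outlier, for |d| >= 1."""
    return d + 1.0 / d

def top_eigenvalue_rank_one(n: int, d: float, seed: int = 0) -> float:
    """Largest eigenvalue of H + d * e_1 e_1^T for one sampled GOE matrix H."""
    rng = np.random.default_rng(seed)
    h = sample_goe(n, rng)
    h[0, 0] += d  # rank-one deformation with eigenvalue d and eigenvector e_1
    return float(np.linalg.eigvalsh(h)[-1])

# For d = 2 the top eigenvalue should sit near theta(2) = 2.5;
# for d = 0.5 no outlier is created and it should stay near the edge 2.
print(top_eigenvalue_rank_one(300, 2.0), theta(2.0))
print(top_eigenvalue_rank_one(300, 0.5))
```

At a finite size such as $N = 300$ the agreement is only approximate, since the outlier fluctuates on the scale $N^{-1/2}(|d| - 1)^{1/2}$ and the edge eigenvalue on the scale $N^{-2/3}$.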

The outlier $\mu_{\alpha(i)}$ associated with $d_i$ fluctuates on the scale $N^{-1/2} (|d_i| - 1)^{1/2}$. Thus, $\mu_{\alpha(i)}$ fluctuates on
the scale $N^{-1/2}$ if $d_i$ is well-separated from the critical point 1, and on the scale $N^{-2/3}$ if $d_i$ is critical, i.e.
$d_i = 1 + a N^{-1/3}$ for some fixed $a > 0$. The outliers associated with $d_i$ and $d_j$ overlap if their separation is
comparable to or less than the scale on which they fluctuate. The overlapping condition thus reads

$|\theta(d_i) - \theta(d_j)| \leq C N^{-1/2} (|d_i| - 1)^{1/2}$ (2.7)

for some (typically large) constant $C > 0$. Note that the factor $|d_i| - 1$ on the right-hand side could be
replaced with $|d_j| - 1$. Indeed, recalling (2.4), it is not hard to check that (2.7) for some $C > 0$ is equivalent
to (2.7) with $d_i$ on the right-hand side replaced with $d_j$ and the constant $C$ replaced with a constant $C' \asymp C$.



Using (2.5) and recalling (2.4), we may rewrite the overlapping condition (2.7) as

$N^{1/2} (|d_i| - 1)^{1/2} |d_i - d_j| \leq C$ (2.8)

for some $C > 0$. As in (2.7), $|d_i| - 1$ may be replaced with $|d_j| - 1$. Figure 2.1 summarizes the general picture
of outliers.
Figure 2.1: A general outlier configuration. We draw the outlier $\mu_{\alpha(i)}$ associated with $d_i$ using a black line
marking its mean location $\theta(d_i)$ and a grey curve indicating its probability density. The breadth of the curve
associated with $d_i$ is of the order $N^{-1/2} (|d_i| - 1)^{1/2}$. Outliers whose probability densities overlap satisfy
(2.7) (or, equivalently, (2.8)). We do not draw the bulk eigenvalues, which are contained in the grey bar.
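The passage from the overlapping condition (2.7) to its rewritten form (2.8) can be made explicit as follows. This is a heuristic sketch rather than the authors' argument; it assumes that $d_i$ and $d_j$ have the same sign, remain bounded as in (2.2), and that $|d_j| - 1$ is comparable to $|d_i| - 1$.

```latex
% Difference of classical locations, using \theta(d) = d + 1/d from (2.5):
\theta(d_i) - \theta(d_j)
  = (d_i - d_j) \Bigl( 1 - \frac{1}{d_i d_j} \Bigr)
  = (d_i - d_j) \, \frac{d_i d_j - 1}{d_i d_j} .

% Under the stated assumptions, d_i d_j - 1 \asymp |d_i| - 1 and d_i d_j \asymp 1, so that
|\theta(d_i) - \theta(d_j)| \asymp (|d_i| - 1) \, |d_i - d_j| .

% Inserting this into (2.7) and dividing by N^{-1/2} (|d_i| - 1)^{1/2} gives (2.8):
(|d_i| - 1) \, |d_i - d_j| \leq C N^{-1/2} (|d_i| - 1)^{1/2}
  \quad \Longleftrightarrow \quad
N^{1/2} (|d_i| - 1)^{1/2} \, |d_i - d_j| \leq C .
```

The constant $C$ changes by a bounded factor between the two formulations, in line with the convention that $C$ denotes a generic constant.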



2.3. The distribution of a single group. After these preparations, we state our results. We begin by defining
a reference matrix which will describe the distribution of a group of overlapping outliers. Define the moment
matrices $\mu^{(3)} = (\mu^{(3)}_{ij})$ and $\mu^{(4)} = (\mu^{(4)}_{ij})$ of $H$ through

Using the matrices $\mu^{(3)}$ and $\mu^{(4)}$ we define the deterministic functions

$\mathcal{P}_{ij,kl}(R) := R_{il} R_{kj} + 1(\beta = 1) R_{ik} R_{jl}$
a.b 
a,h 

where $i, j, k, l \in [\![1, r]\!]$, $R$ is an $r \times r$ matrix, and $V$ an $N \times r$ matrix. Moreover, we define the deterministic
$r \times r$ matrix



Remark 2.3. Using Cauchy-Schwarz and the assumption (2.1), it is easy to check that $\mathcal{P}(V^* V)$, $\mathcal{Q}(V)$,
$\mathcal{R}(V)$, and $\mathcal{S}(V)$ are uniformly bounded for $V$ satisfying $0 \leq V^* V \leq 1$ (in the sense of quadratic forms).

Next, let $\delta \equiv \delta_N$ be a positive sequence satisfying $\varphi^{-1} \leq \delta \leq 1$. (Our result will be independent of $\delta$
provided it satisfies this condition; see Remark 2.4 below.) The sequence $\delta$ will serve as a cutoff in the size of
the entries of $V$ when computing the law of $V^* H V$: entries of $V$ smaller than $\delta$ give rise to an asymptotically
Gaussian random variable by the Central Limit Theorem; the remaining entries are treated separately, and
the associated random variable is in general not Gaussian. Thus, we define the matrix $V_\delta = (V^\delta_{ij})$ through

$V^\delta_{ij} := V_{ij} 1(|V_{ij}| > \delta)$.

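For $\ell \in [\![1, r]\!]$ satisfying $|d_\ell| > 1$ we define the $r \times r$ matrix

$\Upsilon^\ell := (|d_\ell| + 1)(|d_\ell| - 1)^{1/2} \left( \frac{N^{1/2} V_\delta^* H V_\delta}{d_\ell^2} + \frac{\mathcal{S}(V)}{d_\ell^4} \right)$. (2.9)

Abbreviate

$\Delta_{ij,kl} := \mathcal{P}_{ij,kl}(1) = \delta_{il} \delta_{kj} + 1(\beta = 1) \delta_{ik} \delta_{jl}$. (2.10)

Note that $\Delta$ is nothing but the covariance matrix of a GOE/GUE matrix: if $r^{-1/2} \Phi$ is an $r \times r$ GOE/GUE
matrix then $\mathbb{E} \Phi_{ij} \Phi_{kl} = \Delta_{ij,kl}$. We introduce an $r \times r$ Gaussian matrix $\Psi^\ell$, independent of $H$, which is
complex Hermitian for $\beta = 2$ and real symmetric for $\beta = 1$. The entries of $\Psi^\ell$ are centred, and their law is
determined by the covariance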

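$\mathbb{E} \Psi^\ell_{ij} \Psi^\ell_{kl} = \frac{|d_\ell| + 1}{d_\ell^2} \Delta_{ij,kl} + (|d_\ell| + 1)^2 (|d_\ell| - 1) \left( - \frac{\mathcal{P}_{ij,kl}(V_\delta^* V_\delta)}{d_\ell^4} + \frac{\mathcal{Q}_{ij,kl}(V)}{d_\ell^5} + \frac{\mathcal{R}_{ij,kl}(V)}{d_\ell^6} \right) + \mathcal{E}_{ij,kl}$. (2.11)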

Here $\mathcal{E}_{ij,kl} := \varphi^{-1} \Delta_{ij,kl}$ is a term that is needed to ensure that the right-hand side of (2.11) is a nonnegative
$r^2 \times r^2$ matrix. This nonnegativity follows as a by-product of our proof, in which the right-hand side of
(2.11) is obtained from the covariance of an explicit random matrix; see Proposition 6.1 below for more
details. Note that the term $\mathcal{E}_{ij,kl}$ does not influence the asymptotic distribution of $\Psi^\ell$.
Remark 2.4. A different choice of $\delta$, subject to $\varphi^{-1} \leq \delta \leq 1$, leads to the same asymptotic distribution for
$\Upsilon^\ell + \Psi^\ell$. This is an easy consequence of the Central Limit Theorem and the observation that the matrix
entries 



have covariance matrix {\de\ + iy{\de\ — l)c?^ 14). 



Before stating our result in full generality, we give a special case which captures its essence and whose 
statement is somewhat simpler. 

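Theorem 2.5. For large enough $K$ the following holds. Let $\pi \subset [\![1, r]\!]$ be a subset of consecutive integers,
and fix $\ell \in \pi$. Suppose that $|d_\ell| \geq 1 + \varphi^K N^{-1/3}$. Suppose moreover that there is a constant $C$ such that

$N^{1/2} (|d_\ell| - 1)^{1/2} |d_i - d_\ell| \leq C$ (2.12)

for all $i \in \pi$ and, as $N \to \infty$,

$N^{1/2} (|d_\ell| - 1)^{1/2} |d_i - d_\ell| \to \infty$ (2.13)

for all $i \in [\![1, r]\!] \setminus \pi$.

Define the rescaled eigenvalues $\zeta = (\zeta_i)_{i \in \pi}$ through

$\zeta_i := N^{1/2} (|d_\ell| - 1)^{-1/2} (\mu_{\alpha(i)} - \theta(d_\ell))$, (2.14)

where we recall the definition (2.6) of $\alpha(i)$. Let $\xi = (\xi_i)_{i \in \pi}$ denote the eigenvalues of the random $|\pi| \times |\pi|$
matrix

$\Upsilon^\ell_{[\pi]} + \Psi^\ell_{[\pi]} + N^{1/2} (|d_\ell| - 1)^{1/2} (|d_\ell| + 1) (d_\ell^{-1} - D_{[\pi]}^{-1})$. (2.15)

Then for any bounded and continuous function $f$ we have

$\lim_N (\mathbb{E} f(\zeta) - \mathbb{E} f(\xi)) = 0$.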



The subset $\pi$ indexes outliers that belong to the same group of overlapping outliers, as required by (2.12)
(see also (2.8) in the preceding discussion). As required by (2.13), the remaining outliers do not overlap with
the outliers indexed by $\pi$.



Remark 2.6. The reference point $\ell$ for the block $\pi$ is arbitrary and unimportant. See Lemma 4.6 below and
the comment preceding it for a more detailed discussion.

Remark 2.7. For the special case $\pi = \{\ell\}$, Theorem 2.5 essentially^2 reduces to Theorem 2.14 of [21]. In
addition, Theorem 2.5 corrects a minor issue in the statement of Theorem 2.14 of [21], where the variance
of $\Psi$ was not necessarily positive. Indeed, in the language of the current paper, in [21] the term $V_\delta^* H V_\delta$ in
(2.9) was of the form $V^* H V$, which amounted to transferring a large Gaussian component from $\Psi$ to $\Upsilon$.
This transfer was ill-advised as it sometimes resulted in a negative variance for $\Psi$ (which would however be
compensated in the sum $\Upsilon + \Psi$ by a large asymptotically Gaussian component in $\Upsilon$).



The functions $\mathcal{P}$, $\mathcal{Q}$, $\mathcal{R}$, and $\mathcal{S}$ in (2.9) and (2.11) are in general nonzero in the limit $N \to \infty$. They
encode the non-universality of the distribution of the outliers. Thus, the distribution of the outliers may
depend on the law of the entries of $H$ as well as on the geometry of the eigenvectors $V$.

In the GOE/GUE case it is easy to check that $\Upsilon^\ell + \Psi^\ell$ is asymptotically Gaussian with covariance matrix

$\frac{|d_\ell| + 1}{d_\ell^2} \Delta$. (2.16)

Moreover, if $\lim_N |d_\ell| = 1$ then the matrix $\Upsilon^\ell + \Psi^\ell$ converges weakly to a Gaussian matrix with covariance
given by (2.16). In this case, therefore, the non-universality is washed out. Thus, only outliers separated
from the bulk spectrum $[-2, 2]$ by a distance of order one may exhibit non-universality.
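As a rough numerical sanity check of the Gaussian covariance in the GOE case, one can sample a rank-one deformation and compare the empirical variance of the rescaled outlier with $\Delta_{11,11} (|d| + 1)/d^2 = 2(|d| + 1)/d^2$ for $\beta = 1$. The sketch below is a Monte Carlo illustration under the assumptions $r = 1$, $d = 2$, and deformation along the first basis vector; the sample sizes and function names are illustrative, and agreement is only expected up to statistical and finite-$N$ errors.

```python
import numpy as np

def rescaled_outlier(n: int, d: float, rng: np.random.Generator) -> float:
    """One sample of N^(1/2) (d-1)^(-1/2) (mu_N - theta(d)) for GOE plus d e_1 e_1^T, d > 1."""
    g = rng.standard_normal((n, n))
    h = (g + g.T) / np.sqrt(2.0 * n)   # GOE normalization, spectrum close to [-2, 2]
    h[0, 0] += d                       # rank-one deformation with eigenvalue d
    mu = np.linalg.eigvalsh(h)[-1]     # largest eigenvalue, i.e. the outlier
    return float(np.sqrt(n / (d - 1.0)) * (mu - (d + 1.0 / d)))

def empirical_variance(n: int = 200, d: float = 2.0, trials: int = 200, seed: int = 0) -> float:
    """Empirical variance of the rescaled outlier over independent samples."""
    rng = np.random.default_rng(seed)
    samples = [rescaled_outlier(n, d, rng) for _ in range(trials)]
    return float(np.var(samples))

def predicted_variance(d: float) -> float:
    """Delta_{11,11} (|d|+1)/d^2 with beta = 1, where Delta_{11,11} = 2."""
    return 2.0 * (abs(d) + 1.0) / d**2

print(empirical_variance(), predicted_variance(2.0))
```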



^2 In fact, the condition of [21] analogous to (2.13), Equation (2.24) in [21], is slightly stronger than (2.13).

If $\lim_N \max_{i,j} |V_{ij}| = 0$ then an appropriate choice of $\delta$ yields $\Upsilon^\ell = (|d_\ell| + 1)(|d_\ell| - 1)^{1/2} d_\ell^{-4} \mathcal{S}(V)$ as well
as a matrix $\Psi^\ell$ whose covariance is asymptotically that of the GOE/GUE case, i.e. (2.16). Hence in this
case the only manifestation of non-universality is the deterministic shift given by $\Upsilon^\ell$.

It is possible to find scenarios in which each term of (2.9) and (2.11) (apart from the trivial error term $\mathcal{E}$
in (2.11)) contributes in the limit $N \to \infty$. This is for instance the case if $\mu^{(3)}_{ij}$ and $\mu^{(4)}_{ij}$ do not depend on $i$
and $j$, $\mu^{(4)}_{ij}$ is not asymptotically $4 - \beta$, and an eigenvector $\mathbf{v}^{(i)}$ satisfies $\|\mathbf{v}^{(i)}\|_\infty \geq c$ as well as $\|\mathbf{v}^{(i)}\|_1 \geq c N^{1/2}$
for some constant $c > 0$. We refer to [21, Remarks 2.17 - 2.21] for analogous remarks, where more details
are given for the case $\pi = \{\ell\}$.

Next, we give the asymptotic distribution of a group of overlapping outliers in full generality. Thus,
Theorem 2.9 below holds for arbitrary sequences $V \equiv V_N$ and $D \equiv D_N$ satisfying $V^* V = 1$ and (2.2).

Definition 2.8. Let $N$ and $D$ be given. For $s > 0$ and $\ell \in [\![1, r]\!]$ satisfying $|d_\ell| > 1$, define $\pi(\ell, s) \equiv
\pi_{N,D}(\ell, s)$ as the smallest subset of $[\![1, r]\!]$ with the two following properties.

(i) $\ell \in \pi(\ell, s)$.

(ii) If for $i, j \in [\![1, r]\!]$ we have $|d_i| > 1$ and

$N^{1/2} (|d_i| - 1)^{1/2} |d_i - d_j| \leq s$, (2.17)

then either $i, j \in \pi(\ell, s)$ or $i, j \in [\![1, r]\!] \setminus \pi(\ell, s)$.

The subset $\pi(\ell, s)$ indexes those outliers that belong to the same group of overlapping outliers as $\ell$, where
$s$ is a cutoff distance used to determine whether two outliers are considered overlapping. Note that $\pi(\ell, s)$ is
a set of consecutive integers.

Theorem 2.9. For large enough $K$ the following holds. Let $\varepsilon > 0$ be arbitrary, and let $f_1, \dots, f_r$ be bounded
continuous functions, where $f_k$ is a function on $\mathbb{R}^k$. Then there exist $N_0 \in \mathbb{N}$ and $s_0 > 0$ such that for all
$N \geq N_0$ and $s \geq s_0$ the following holds.
Suppose that $\ell \in [\![1, r]\!]$ satisfies

$|d_\ell| \geq 1 + \varphi^K N^{-1/3}$, (2.18)

and set $\pi := \pi(\ell, s)$. Then

$\bigl| \mathbb{E} f_{|\pi|}(\zeta) - \mathbb{E} f_{|\pi|}(\xi) \bigr| \leq \varepsilon$, (2.19)

where $\zeta$ and $\xi$ were defined in Theorem 2.5.



2.4. The joint distribution. In order to describe the joint distribution of all outliers, we organize them into
groups of overlapping outliers, using a partition $\Pi$ whose blocks $\pi$ are defined using the subsets $\pi(\ell, s)$ from
Definition 2.8.



Definition 2.10. Let $N$ and $D$ be given, and fix $K > 0$ and $s > 0$. We introduce a partition^3 $\Pi \equiv
\Pi(N, K, s, D)$ on a subset of $[\![1, r]\!]$, defined as

$\Pi := \{ \pi(\ell, s) : \ell \in [\![1, r]\!], \; |d_\ell| \geq 1 + \varphi^K N^{-1/3} \}$.

We also use the notation $\Pi = \{\pi\}_{\pi \in \Pi}$ and $[\Pi] := \bigcup_{\pi \in \Pi} \pi$.

The indices in $[\Pi]$ give rise to outliers, which are grouped into the blocks of $\Pi$. Indices in $[\![1, r]\!] \setminus [\Pi]$ do
not give rise to outliers.
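The grouping of outliers into overlapping blocks can be computed mechanically. The sketch below is one possible reading of the definitions of the sets $\pi(\ell, s)$ and of the partition, phrased as connected components of the relation (2.17); the function name `overlap_blocks`, the 0-based indexing, and the parameter `threshold` (standing in for $\varphi^K N^{-1/3}$, whose constant $K$ is not specified here) are assumptions made for illustration.

```python
import math
from typing import Dict, List

def overlap_blocks(d: List[float], n: int, s: float, threshold: float) -> List[List[int]]:
    """Group the (0-based) indices of d into blocks of overlapping outliers.

    Indices i, j are linked when |d_i| > 1 and
    sqrt(n) * sqrt(|d_i| - 1) * |d_i - d_j| <= s, as in (2.17). A block is kept
    if it contains at least one index l with |d_l| >= 1 + threshold.
    """
    r = len(d)
    parent = list(range(r))

    def find(i: int) -> int:
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(r):
        if abs(d[i]) <= 1.0:
            continue  # condition (2.17) is only imposed when |d_i| > 1
        scale = math.sqrt(n) * math.sqrt(abs(d[i]) - 1.0)
        for j in range(r):
            if scale * abs(d[i] - d[j]) <= s:
                parent[find(i)] = find(j)  # i and j must lie in the same block

    components: Dict[int, List[int]] = {}
    for i in range(r):
        components.setdefault(find(i), []).append(i)
    return sorted(
        block for block in components.values()
        if any(abs(d[l]) >= 1.0 + threshold for l in block)
    )

# Two nearly equal eigenvalues 1.5 and 1.5001 form one block; the others stay alone.
print(overlap_blocks([-3.0, 1.5, 1.5001, 2.5], n=10_000, s=5.0, threshold=0.1))
```

A union-find structure is used because the defining property is closed under chaining: if $i$ overlaps with $j$ and $j$ with $k$, all three indices end up in the same block.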
For $\pi \in \Pi$ we define

$d_\pi := \min\{ d_i : i \in \pi \}$. (2.20)

We chose this value for definiteness, although any other choice of $d_i$ with $i \in \pi$ would do equally well.



^3 That $\Pi$ is a partition follows from the observation that $\ell' \in \pi(\ell, s)$ if and only if $\ell \in \pi(\ell', s)$. Therefore if $\ell$ and $\ell'$ satisfy
$|d_\ell| \geq 1 + \varphi^K N^{-1/3}$ and $|d_{\ell'}| \geq 1 + \varphi^K N^{-1/3}$ then either $\pi(\ell, s) = \pi(\ell', s)$ or $\pi(\ell, s) \cap \pi(\ell', s) = \emptyset$.
Next, in analogy to (2.15), we define a $|[\Pi]| \times |[\Pi]|$ reference matrix whose eigenvalues will have the
same asymptotic distribution as the appropriately rescaled outliers $(\mu_{\alpha(i)})_{i \in [\Pi]}$. Define the block diagonal
$|[\Pi]| \times |[\Pi]|$ matrix $\Upsilon = \bigoplus_{\pi \in \Pi} \Upsilon^\pi$, where


In addition, we introduce a Hermitian, Gaussian $|[\Pi]| \times |[\Pi]|$ matrix $\Psi$ that is independent of $H$ and whose
entries have mean zero. It is block diagonal, $\Psi = \bigoplus_{\pi \in \Pi} \Psi^\pi$, where the block $\Psi^\pi = (\Psi^\pi_{ij})_{i,j \in \pi}$ is a $|\pi| \times |\pi|$
matrix. The law of $\Psi$ is determined by the covariance



n 



dl 



^tttt' ^ij ,kl ^tttt' -^ij.kl 



(M,|-i)V^(|d,| + i) 



d-jrf d-jT 



(2.21) 



where we defined



(Note that $\mathcal{Q}_{ij,kl} = \mathcal{W}_{ij,kl} + \mathcal{W}_{kl,ij}$.) As in (2.11), the factor $\mathcal{E}_{ij,kl} = \varphi^{-1} \Delta_{ij,kl}$, whose contribution vanishes
in the limit $N \to \infty$, simply ensures that the right-hand side of (2.21) defines a nonnegative matrix; this
nonnegativity is an immediate corollary of our proof in Section 9.1.

Next, in analogy to (2.14), we introduce the rescaled family of outliers $\zeta = (\zeta^\pi_i : \pi \in \Pi, i \in \pi) \in \mathbb{R}^{[\Pi]}$
whose entries are defined by

$\zeta^\pi_i := N^{1/2} (|d_\pi| - 1)^{-1/2} (\mu_{\alpha(i)} - \theta(d_\pi))$, (2.22)

where we recall the definition (2.6) of $\alpha(i)$. Moreover, for $\pi \in \Pi$ let $\xi^\pi = (\xi^\pi_i : i \in \pi)$ denote the eigenvalues
of the random $|\pi| \times |\pi|$ matrix

and write $\xi = (\xi^\pi : \pi \in \Pi) = (\xi^\pi_i : \pi \in \Pi, i \in \pi) \in \mathbb{R}^{[\Pi]}$. We may now state our main result in its greatest
generality.

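Theorem 2.11. For large enough $K$ the following holds. Let $\varepsilon > 0$ be arbitrary, and let $f_1, \dots, f_r$ be bounded
continuous functions, where $f_k$ is a function on $\mathbb{R}^k$. Then there exist $N_0 \in \mathbb{N}$ and $s_0 > 0$ such that for all
$N \geq N_0$ and $s \geq s_0$ we have

$\bigl| \mathbb{E} f_{|[\Pi]|}(\zeta) - \mathbb{E} f_{|[\Pi]|}(\xi) \bigr| \leq \varepsilon$.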



We conclude this section by drawing some consequences from Theorem 2.11. In the GOE/GUE case, it
is easy to see that the law of the block matrix $\Upsilon + \Psi$ is asymptotically Gaussian with covariance

$\delta_{\pi \pi'} \frac{|d_\pi| + 1}{d_\pi^2} \Delta_{ij,kl}$.

In particular, we find that overlapping outliers repel each other according to the usual random matrix level
repulsion, while non-overlapping outliers are asymptotically independent.

In general outliers are not asymptotically independent, even if they do not overlap. Such correlations arise
from correlations between different blocks of $\Upsilon + \Psi$. There are two possible sources for these correlations:
the term $V_\delta^* H V_\delta$ in the definition of $\Upsilon$, and the terms $\mathcal{R}$ and $\mathcal{W}$ in the covariance (2.21) of the Gaussian
matrix $\Psi$. Thus, two outliers may be strongly correlated even if they are located on opposite sides of the
bulk spectrum.
3. Tools

The rest of this paper is devoted to the proofs of Theorems 2.5, 2.9, and 2.11. Sections 3-8 are devoted to
the proof of Theorem 2.9. Theorem 2.5 is an easy corollary of Theorem 2.9. Finally, Theorem 2.11 is proved
in Section 9 by an extension of the arguments of Sections 3-8.

We begin with a preliminary section that collects tools we shall use in the proof. We introduce the
spectral parameter

$z = E + i\eta$,

which will be used as the argument of Stieltjes transforms and resolvents. In the following we often use the
notation $E = \operatorname{Re} z$ and $\eta = \operatorname{Im} z$ without further comment. Let

$\varrho(x) := \frac{1}{2\pi} \sqrt{[4 - x^2]_+}$ $(x \in \mathbb{R})$

denote the density of the local semicircle law, and

$m(z) := \int \frac{\varrho(x)}{x - z} \, dx$ $(z \notin [-2, 2])$ (3.1)

its Stieltjes transform. It is well known that the Stieltjes transform $m$ satisfies the identity

$m(z) + \frac{1}{m(z)} + z = 0$. (3.2)

It is easy to see that (3.2) and the definition (2.5) imply

$m(\theta(d)) = -\frac{1}{d}$. (3.3)
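The two identities for the Stieltjes transform can be checked numerically. The sketch below uses the closed form $m(z) = (-z + \sqrt{z-2}\,\sqrt{z+2})/2$ with principal square roots, which is a standard way to select the solution of the quadratic equation that behaves like $-1/z$ at infinity; the function names are illustrative.

```python
import cmath

def m_semicircle(z: complex) -> complex:
    """Stieltjes transform of the semicircle law for z outside [-2, 2]."""
    z = complex(z)
    # The product of principal square roots picks the branch with m(z) ~ -1/z as |z| grows.
    return (-z + cmath.sqrt(z - 2.0) * cmath.sqrt(z + 2.0)) / 2.0

def theta(d: float) -> float:
    """Classical location d + 1/d, for |d| >= 1."""
    return d + 1.0 / d

def self_consistent_residual(z: complex) -> float:
    """|m(z) + 1/m(z) + z|, which should vanish up to rounding."""
    m = m_semicircle(z)
    return abs(m + 1.0 / m + z)

print(self_consistent_residual(1.0 + 2.0j))   # close to 0
print(m_semicircle(theta(2.0)))               # close to -1/2
print(m_semicircle(theta(-3.0)))              # close to 1/3
```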

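The following lemma collects some useful properties of $m$.
Lemma 3.1. For $|z| \leq 2\Sigma$ we have

$|m(z)| \asymp 1$, $|1 - m(z)^2| \asymp \sqrt{\kappa + \eta}$. (3.4)

Moreover,

$\operatorname{Im} m(z) \asymp \sqrt{\kappa + \eta}$ if $|E| \leq 2$, and $\operatorname{Im} m(z) \asymp \frac{\eta}{\sqrt{\kappa + \eta}}$ if $|E| \geq 2$.

(Here the implicit constants depend on $\Sigma$.)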

Proof. The proof is an elementary calculation; see Lemma 4.2 in [l^. □ 

The following definition introduces a notion of high probability that is suitable for our needs. 

Definition 3.2 (High probability events). We say that an $N$-dependent event $\Xi$ holds with high
probability if there is some constant $C$ such that

$\mathbb{P}(\Xi^c) \leq N^C \exp(-\varphi)$ (3.5)

for large enough $N$.



Next, we give the key tool behind the proof of Theorem 2.9: the isotropic local semicircle law. We use
the notation $\mathbf{v} = (v_i)_{i=1}^N \in \mathbb{C}^N$ for the components of a vector. We introduce the standard scalar product
$\langle \mathbf{v}, \mathbf{w} \rangle := \sum_i \bar{v}_i w_i$. For $\eta > 0$ we define the resolvent of $H$ through

$G(z) := (H - z)^{-1}$.

The following result was proved in [21, Theorem 2.3].

Theorem 3.3 (Isotropic local semicircle law outside of the spectrum). Fix $\Sigma \geq 3$. There exists
a constant $C$ such that for large enough $K$ and any deterministic $\mathbf{v}, \mathbf{w} \in \mathbb{C}^N$ we have with high probability

$|\langle \mathbf{v}, G(z) \mathbf{w} \rangle - m(z) \langle \mathbf{v}, \mathbf{w} \rangle| \leq \varphi^C \sqrt{\frac{\operatorname{Im} m(z)}{N \eta}} \, \|\mathbf{v}\| \|\mathbf{w}\|$ (3.6)

for all

$E \in [-\Sigma, -2 - \varphi^K N^{-2/3}] \cup [2 + \varphi^K N^{-2/3}, \Sigma]$ and $\eta \in (0, \Sigma]$.

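For $E \in \mathbb{R}$ define

$$\kappa_E := \big| |E| - 2 \big|, \qquad (3.7)$$

the distance from $E$ to the spectral edges $\pm 2$. We have the simple estimate

$$\kappa_{\theta(d)} \asymp (|d| - 1)^2 \qquad (3.8)$$

for $|d| > 1$. Using (3.8) and Lemma 3.1 we find that the control parameter in (3.6) may be written as

$$\sqrt{\frac{\operatorname{Im} m(z)}{N\eta}} \asymp N^{-1/2}(\kappa_E + \eta)^{-1/4} \leq N^{-1/2}\kappa_E^{-1/4}. \qquad (3.9)$$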
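For the reader's convenience, here is a short worked derivation of (3.8) and (3.9). It is a sketch under two assumptions that are not spelled out in this section: that definition (2.5) reads $\theta(d) = d + 1/d$, and that $|d|$ ranges over a bounded set. The second display uses only the case $|E| \geq 2$ of Lemma 3.1.

```latex
% (3.8): for |d| > 1 and theta(d) = d + 1/d (assumed form of (2.5)),
\kappa_{\theta(d)} = |d| + \frac{1}{|d|} - 2 = \frac{(|d| - 1)^2}{|d|}
\asymp (|d| - 1)^2 \quad \text{for bounded } |d| .

% (3.9): for |E| \ge 2, Lemma 3.1 gives Im m(z) \asymp \eta / \sqrt{\kappa_E + \eta}, hence
\sqrt{\frac{\operatorname{Im} m(z)}{N \eta}}
\asymp \sqrt{\frac{1}{N \sqrt{\kappa_E + \eta}}}
= N^{-1/2} (\kappa_E + \eta)^{-1/4}
\le N^{-1/2} \kappa_E^{-1/4} .
```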



The following result provides sharp (up to logarithmic factors) large deviations bounds on the locations 
of the outliers. 

Theorem 3.4 (Locations of the deformed eigenvalues). There exists a constant C such that, for 
large enough K and under the condition (2.2), we have 

with high probability provided that $|d_i| \geq 1 + \varphi^K N^{-1/3}$.



Proof. This was essentially proved in [21, Theorem 2.7] by setting $\psi = 1$ there; see Equation (2.20) of [21].

Note that Theorem 2.7 of [21] has slightly stronger assumptions than Theorem 3.4, requiring in addition
that there be no eigenvalues $d_j$ of $D$ satisfying $\big| |d_j| - 1 \big| < \varphi^K N^{-1/3}$. However, this assumption was only
needed for Equation (2.21) of [21], and the proof from Section 6 of [21] may be applied verbatim to (3.10)
under the assumptions of Theorem 3.4. □



We shall often need to consider minors of $H$, which are the content of the following definition. It is a
convenient extension of Definition 2.2.



Definition 3.5 (Minors and partial expectation). (i) For $U \subset [\![1, N]\!]$ we define

$$H^{(U)} := (h_{ij})_{i,j \in U^c},$$

where $U^c := [\![1, N]\!] \setminus U$. Moreover, we define the resolvent of $H^{(U)}$ through

$$G^{(U)}(z) := (H^{(U)} - z)^{-1}.$$

(ii) Set

$$\sum_i^{(U)} := \sum_{i \,:\, i \notin U}.$$

When $U = \{a\}$, we abbreviate $(\{a\})$ by $(a)$ in the above definitions; similarly, we write $(ab)$ instead of
$(\{a, b\})$.

(iii) For $U \subset [\![1, N]\!]$ define the partial expectation $\mathbb{E}_U(X) := \mathbb{E}(X \mid H^{(U)})$.

Next, we record some basic large deviations estimates from [21, Lemma 3.5].
Lemma 3.6 (Large deviations estimates). Let $a_1, \dots, a_N, b_1, \dots, b_N$ be independent random variables
with zero mean and unit variance. Assume that there is a constant $\vartheta > 0$ such that

$$\mathbb{P}(|a_i| \geq x) \leq \exp(-x^\vartheta) \qquad (i = 1, \dots, N).$$



(3.11) 



Then there exists a constant p = p{'d) > 1 such that, for any ^ > and any deterministic complex numbers 
Ai and B^j , we have with high probability 



^aiB^jbj 



1/2 



1/2 



1/2 



(3.12) 
(3.13) 
(3.14) 



We conclude this preliminary section by quoting a result on the eigenvalue rigidity of $H$. Denote by
$\gamma_1 \leq \gamma_2 \leq \cdots \leq \gamma_N$ the classical locations of the eigenvalues of $H$, defined through

$$N \int_{-\infty}^{\gamma_\alpha} \varrho(x)\,\mathrm{d}x = \alpha \qquad (1 \leq \alpha \leq N). \qquad (3.15)$$

The following result was proved in [17, Theorem 2.2].

Theorem 3.7 (Rigidity of eigenvalues). There exists a constant $C$ such that we have with high probability

$$|\lambda_\alpha - \gamma_\alpha| \leq \varphi^C \big(\min\{\alpha, N + 1 - \alpha\}\big)^{-1/3} N^{-2/3}$$

for all $\alpha \in [\![1, N]\!]$.
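The classical locations defined by (3.15) can be computed explicitly, which may help in reading the rigidity estimate. The sketch below is an illustration only: it uses the closed-form distribution function of the semicircle density together with bisection, and the function names `semicircle_cdf` and `classical_location` are hypothetical.

```python
import math


def semicircle_cdf(x):
    """Integral of the semicircle density over (-inf, x], for x clipped to [-2, 2]."""
    x = max(-2.0, min(2.0, x))
    root = math.sqrt(max(0.0, 4.0 - x * x))
    return 0.5 + x * root / (4 * math.pi) + math.asin(x / 2) / math.pi


def classical_location(alpha, N, iters=80):
    """Solve N * cdf(gamma) = alpha, cf. (3.15), by bisection on [-2, 2]."""
    lo, hi = -2.0, 2.0
    target = alpha / N
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if semicircle_cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Near the lower edge the density behaves like $\pi^{-1}\sqrt{x + 2}$, so one expects $\gamma_\alpha + 2 \approx (3\pi\alpha / (2N))^{2/3}$ for small $\alpha / N$; this is the $N^{-2/3}$ edge scale that appears in Theorem 3.7.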



4. Coarser grouping of outliers and reduction to the law of G 



For the following we fix the sequences $(V_N)_N$ and $(D_N)_N$. It will sometimes be convenient to assume that

$$\lim_N d_i \ \text{ exists for all } i \in [\![1, r]\!]. \qquad (4.1)$$

To that end, we invoke the following elementary result.

Lemma 4.1. Let $(a_N)_N$ be a sequence of nonnegative numbers and $\varepsilon > 0$. The following statements are
equivalent.

(i) $a_N \leq \varepsilon$ for large enough $N$.

(ii) Each subsequence has a further subsequence along which $a_N \leq \varepsilon$.

We use Lemma 4.1 by setting $a_N$ to be the left-hand side of (2.19). Using Lemma 4.1 we therefore find
that Theorem 2.9 holds for arbitrary $D$ if it holds for $D$ satisfying (4.1). From now on, we therefore assume
without loss of generality that (4.1) holds.



For the proof of Theorem 2.9 we need a new subset of $[\![1, r]\!]$, denoted by $\gamma(\ell)$, which is larger than or
equal to the subset $\pi(\ell, s)$ from Definition 2.8.



Definition 4.2. For $\ell \in [\![1, r]\!]$ satisfying (2.18), define $\gamma(\ell) \equiv \gamma_{N,D,K}(\ell)$ as the smallest subset of $[\![1, r]\!]$ with
the two following properties.
(a) If for $i, j \in [\![1, r]\!]$ we have $|d_i| > 1$ and

$$N^{1/2}(|d_i| - 1)^{1/2}|d_i - d_j| \leq \varphi^{K/2} \qquad (4.2)$$

then either $i, j \in \gamma(\ell)$ or $i, j \in \overline{\gamma}(\ell)$.
Here we use the notation $\overline{\gamma}(\ell) := [\![1, r]\!] \setminus \gamma(\ell)$.
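The grouping in Definition 4.2 amounts to taking the connected component of $\ell$ under the overlap relation (4.2). The following sketch illustrates this under two assumptions: that $\ell$ itself belongs to $\gamma(\ell)$, and that the threshold $\varphi^{K/2}$ is supplied as a plain number `threshold`. The function name `gamma_group` and the 0-based indexing are choices made here, not notation from the text.

```python
import math


def gamma_group(d, ell, N, threshold):
    """Indices linked to `ell` by chains of pairs (i, j) with |d_i| > 1 and
    sqrt(N) * sqrt(|d_i| - 1) * |d_i - d_j| <= threshold, as in (4.2)."""
    r = len(d)

    def linked(i, j):
        if abs(d[i]) <= 1:
            return False
        return math.sqrt(N * (abs(d[i]) - 1)) * abs(d[i] - d[j]) <= threshold

    group, stack = {ell}, [ell]
    while stack:
        i = stack.pop()
        for j in range(r):
            # (4.2) is not symmetric in (i, j), so both orderings are tested.
            if j not in group and (linked(i, j) or linked(j, i)):
                group.add(j)
                stack.append(j)
    return sorted(group)
```

With `d = [3.0, 2.99, 2.0, 1.5]`, `N = 10**4` and `threshold = 5.0`, only the first two entries overlap, so they form one group and the remaining entries form singletons.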

Note that $\gamma(\ell)$ is a set of consecutive integers. Similarly to $\pi(\ell, s)$, the set $\gamma(\ell)$ indexes outliers that are
close to that indexed by $\ell$, except that now the threshold used to determine whether two outliers overlap is
larger ($\varphi^{K/2}$ instead of the $N$-independent $s$). This need to regroup outliers into larger subsets arises from
the perturbation theory argument in Proposition 4.5 below. At the end of the proof, in Section 8, we shall
use perturbation theory a second time to obtain a statement involving outliers in $\pi(\ell, s)$ instead of $\gamma(\ell)$.

For the following we introduce the abbreviation

$$\delta_\rho(d) := \varphi^\rho N^{-1/2}(|d| - 1)^{-1/2},$$

so that (4.2) reads $|d_i - d_j| \leq \delta_{K/2}(d_i)$. We have the following elementary result.
Lemma 4.3. Let $\rho > 0$. If $|d| \geq 1 + \varphi^\rho N^{-1/3}$ and $|d - d'| \leq \delta_\rho(d)$ then

$$|d'| - 1 = (|d| - 1)\big(1 + O(\varphi^{-\rho/2})\big).$$
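The lemma follows from a two-line computation, sketched here for completeness. It uses only the definition of $\delta_\rho(d)$ and the lower bound on $|d| - 1$ assumed in the statement.

```latex
% Step 1: the reverse triangle inequality and |d - d'| \le \delta_\rho(d) give
\big| |d'| - |d| \big| \le \delta_\rho(d) = \varphi^{\rho} N^{-1/2} (|d| - 1)^{-1/2} .

% Step 2: |d| - 1 \ge \varphi^{\rho} N^{-1/3} implies (|d| - 1)^{3/2} \ge \varphi^{3\rho/2} N^{-1/2}, so
\frac{\delta_\rho(d)}{|d| - 1}
= \frac{\varphi^{\rho} N^{-1/2}}{(|d| - 1)^{3/2}}
\le \varphi^{\rho - 3\rho/2} = \varphi^{-\rho/2} .

% Step 3: combining the two displays,
|d'| - 1 = (|d| - 1) \Big( 1 + O\big( \varphi^{-\rho/2} \big) \Big) .
```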

For brevity, we fix $\ell$ satisfying (2.18), and abbreviate $\gamma \equiv \gamma(\ell)$ and $\overline{\gamma} \equiv \overline{\gamma}(\ell)$ when there is no risk of
confusion. The indices of $\gamma$ and $\overline{\gamma}$ are separated in the following sense.

Lemma 4.4. If $i \in \gamma$ and $j \in \overline{\gamma}$ then

$$|d_i - d_j| > \delta_{K/2}(d_i). \qquad (4.3)$$

If $i, j \in \gamma$ then

$$|d_i - d_j| \leq 2r\,\delta_{K/2}(d_i). \qquad (4.4)$$

Proof. The bound (4.3) follows immediately from the definition of $\gamma$. The bound (4.4) follows immediately
from Lemma 4.3 and the fact that $\gamma$ is a set of at most $r$ consecutive integers. □
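Since $D$ is diagonal, we may write

$$D = D_{[\gamma]} \oplus D_{[\overline{\gamma}]}.$$

The matrix $D_{[\gamma]}$ has dimensions $|\gamma| \times |\gamma|$ and eigenvalues $(d_i)_{i \in \gamma}$. Define the region

$$B := \Big[ \min_{i \in \gamma}\big(d_i - \delta_{K/4}(d_i)\big),\ \max_{i \in \gamma}\big(d_i + \delta_{K/4}(d_i)\big) \Big].$$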



By (2.18) and (4.4), it is not hard to see that $B \subset \mathbb{R} \setminus [-1, 1]$. For large enough $K$ a simple estimate using
the definition of $\theta$ and the bound (3.10) yields for all $i \in \gamma$

$$\sigma(\widetilde{H}) \cap \theta(B) = (\mu_{\alpha(i)})_{i \in \gamma} \qquad (4.5)$$

with high probability. In other words, $\theta(B)$ houses with high probability all of the outliers indexed by $\gamma$, and
no other eigenvalues of $\widetilde{H}$. Moreover, from Theorem 3.7 we find that for large enough $K$ the region $\theta(B)$
contains with high probability no eigenvalues of $H$.

We may now state the main result of this section. Introduce the $r \times r$ matrix

$$M(z) := V^* G(z) V.$$

To shorten notation, for $i$ satisfying $|d_i| > 1$ we often abbreviate

$$\theta_i := \theta(d_i).$$
Proposition 4.5. The following holds for large enough $K$. Let $\ell \in [\![1, r]\!]$ satisfy (2.18), and write $\gamma \equiv \gamma(\ell)$.
Then for all $i \in \gamma$ we have



m'i9e) 



{M{9,) + D-^] 



[7] 



(4.6) 



with high probability. (Recall Definitions 2.1 and 2.2 for the meaning of $\lambda_i(\cdot)$ on the left-hand side.)



Proof. We have to introduce some additional randomness in order to (almost surely) avoid pathological
coincidences. Thus, let $\Delta$ be an $r \times r$ Hermitian random matrix whose upper-triangular entries are independent
and have an absolutely continuous law supported in the unit disk. Moreover, let $\Delta$ be independent of
$H$. Let $\varepsilon > 0$. We shall prove the claim of Proposition 4.5 for the matrix $\widetilde{H}^\varepsilon := H + V(D^{-1} + \varepsilon\Delta)^{-1}V^*$ for
small enough $\varepsilon$ (depending on $N$), instead of $\widetilde{H} = H + VDV^*$. Having done this, the claim for $\widetilde{H}$ follows
easily by taking the limit $\varepsilon \to 0$.
Define the $r \times r$ matrix

$$A^\varepsilon(x) := M(x) - m(x) + D^{-1} + \varepsilon\Delta. \qquad (4.7)$$



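From [21, Lemma 6.1], we get that $x \notin \sigma(H)$ is an eigenvalue of $\widetilde{H}^\varepsilon$ if and only if $A^\varepsilon(x) + m(x)$ has a zero
eigenvalue. Similarly to Proposition 7.1 in [21], we use perturbation theory to compare the eigenvalues of
$A^\varepsilon(x)$ with those of the block matrix

$$\widehat{A}^\varepsilon(x) := A^\varepsilon_{[\gamma]}(x) \oplus A^\varepsilon_{[\overline{\gamma}]}(x).$$

In order to apply perturbation theory, we must establish a lower bound on the spectral gap

$$\operatorname{dist}\big(\sigma(A^\varepsilon_{[\gamma]}(\theta_\ell)),\ \sigma(A^\varepsilon_{[\overline{\gamma}]}(\theta_\ell))\big).$$

Using Theorem 3.3, (3.8), and (3.9) we find, with high probability,

$$\operatorname{dist}\big(\sigma(A^\varepsilon_{[\gamma]}(\theta_\ell)),\ \sigma(A^\varepsilon_{[\overline{\gamma}]}(\theta_\ell))\big) \geq \operatorname{dist}\big(\sigma(D^{-1}_{[\gamma]}),\ \sigma(D^{-1}_{[\overline{\gamma}]})\big) - \delta_C(d_\ell) - \varepsilon \geq c\,\delta_{K/2}(d_\ell) - \delta_C(d_\ell) \geq \delta_{K/2 - 1}(d_\ell) \qquad (4.8)$$

for large enough $K$ and small enough $\varepsilon$ (depending on $N$), where in the second step we used (4.3).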



Next, Theorem 3.3 (3.8), and (3.9) yield, with high probability. 



(4.9) 



for large enough K and small enough e (depending on N). 
Define the regions 



V := 



U [d, ' - Sk/M , + SK/Mi) 



iG7 



y [d/ - Sk/M) ,d^ + Sk/M 



i£7 



which are disjoint by (4.3). By definition of A'^{9i) and A'^^^^{9e), as well as using (4.4), Theorem 3.3 (3.8), 
and (3.9), we get, with high probability, 

a(^f^j(0,)) C V, a{A%ei)) d VUV 

for large enough K and small enough e (depending on N). Moreover, both A^{9e) and A'^^^{9() have exactly 
I7I eigenvalues in T>; we denote these eigenvalues by {af)i^.y and (af )ig^ respectively. 

We may now apply perturbation theory. Invoking Proposition A.l using (4.8) and (4.9) yields with high 
probability 

' SK/4-2{de)'^^ 



O 



'^K/2 



(4.10) 



for i G J. 
Next, we allow the argument $x$ of $A^\varepsilon(x)$ to vary in order to locate the eigenvalues of $\widetilde{H}^\varepsilon$. We recall the
following derivative bound from [21, Lemma 7.2]: there is a constant $C$ such that for large enough $K$ we
have for all $\ell^2$-normalized $\mathbf{v}, \mathbf{w} \in \mathbb{C}^N$, with high probability,



|9^Gvw(a;) ~ 9:,m(x)(v,w)| < Lp^N^^'^n-^ for x G 



-S , -2 - ^'^/^N-^'^] U [2 + (^^/2iV-i/3 , s] . 

(4.11) 



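By definition of $B$, we find from Lemma 4.3, (2.18), and (4.4) that

$$x \in \theta(B) \quad \Longrightarrow \quad \theta\big(d_\ell - 3r\,\delta_{K/2}(d_\ell)\big) \leq x \leq \theta\big(d_\ell + 3r\,\delta_{K/2}(d_\ell)\big). \qquad (4.12)$$

We deduce using Lemma 4.3, (2.18), and (3.8) that

$$\kappa_x \asymp (|d_\ell| - 1)^2 \quad \text{for } x \in \theta(B). \qquad (4.13)$$

Therefore from Theorem 3.3 we conclude with high probability

$$M(x) = m(x) + O(\delta_C(d_\ell)) \quad \text{for } x \in \theta(B). \qquad (4.14)$$

Similarly, from (4.11) we get with high probability

$$M'(x) = m'(x) + O\big(\varphi^C N^{-1/3}(|d_\ell| - 1)^{-2}\big) \quad \text{for } x \in \theta(B). \qquad (4.15)$$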



With these preliminary bounds, we may vary $x \in \theta(B)$. Let $(a_i^\varepsilon(x))_{i \in \gamma}$ denote the continuous family of
eigenvalues of $A^\varepsilon(x)$ satisfying $a_i^\varepsilon(\theta_\ell) = a_i^\varepsilon$ for $i \in \gamma$. For the following argument it is helpful to keep Figure
4.1 in mind. We make the following claim.



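[Figure: the curve $-m(x)$ over the interval with endpoints $x_-$ and $x_+$; the points of $\sigma(\widetilde{H}^\varepsilon) \cap \theta(B)$ are marked on the axis.]

Figure 4.1: The spectrum of $A^\varepsilon(x)$ for $x \in \theta(B)$. For definiteness, we chose $\gamma = [\![1, 5]\!]$. The region $x \in \theta(B)$
is delimited by dotted lines. The eigenvalues of $\widetilde{H}^\varepsilon$ are labelled by black dots on the $x$-axis.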



(*) Almost surely, for all $x \in \theta(B)$ we have that $a_i^\varepsilon(x) = -m(x)$ for at most one $i \in \gamma$.

We omit the standard$^1$ details of the proof of (*). Note that the necessity for (*) to hold is the only reason
we had to introduce the additional randomness $\Delta$ into $\widetilde{H}^\varepsilon$.

From the definition of $\theta(B)$ one readily finds for all $i \in \gamma$ that

$$a_i^\varepsilon(x_-) < -m(x_-), \qquad -m(x_+) < a_i^\varepsilon(x_+),$$

where $x_\pm$ denote the endpoints of the interval $\theta(B)$. Recall that $\widetilde{H}^\varepsilon$ has with high probability exactly $|\gamma|$
eigenvalues in $\theta(B)$. By continuity of $a_i^\varepsilon(x)$ and the property (*) we therefore get that the function $-m(x)$
intersects each function $a_i^\varepsilon(x)$, $i \in \gamma$, exactly once in $\theta(B)$. Let $i \in \gamma$ and denote by $x_i^\varepsilon$ the unique point
(with high probability) in $\theta(B)$ at which $a_i^\varepsilon(x_i^\varepsilon) = -m(x_i^\varepsilon)$.

$^1$The proof uses the fact that the law of $\Delta$ is absolutely continuous, that the set of singular Hermitian matrices is an algebraic
variety of codimension one, and that the set of Hermitian matrices with multiple eigenvalues at zero is an algebraic subvariety
of codimension two.

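From the definition of $A^\varepsilon$ and (4.15) we get, with high probability,

$$-m(x_i^\varepsilon) = a_i^\varepsilon(\theta_\ell) + O\big(\varphi^C N^{-1/3}(|d_\ell| - 1)^{-2}|x_i^\varepsilon - \theta_\ell|\big) = a_i^\varepsilon + O\big(\varphi^{K/2 + C}N^{-5/6}(|d_\ell| - 1)^{-3/2}\big), \qquad (4.16)$$

where in the second step we used (4.12), the fact that $x_i^\varepsilon \in \theta(B)$, and the elementary bound $|\theta'(d)| \asymp |d| - 1$.
(Recall that by definition $a_i^\varepsilon(\theta_\ell) = a_i^\varepsilon$.) Now we may use (4.10) and (4.16) to get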



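(4.17)

with high probability. Now we expand the left-hand side using the identity

$$m'(x) = \frac{m(x)^2}{1 - m(x)^2} \asymp \kappa_x^{-1/2}, \qquad (4.18)$$

which follows easily from (3.2); in the second step we used Lemma 3.1. Differentiating again, we get
$m''(x) \asymp \kappa_x^{-3/2}$. From (4.13) we therefore get

$$\begin{aligned} m(x_i^\varepsilon) &= m(\theta_\ell) + m'(\theta_\ell)(x_i^\varepsilon - \theta_\ell) + O\Big((|d_\ell| - 1)^{-3}\big((|d_\ell| - 1)\,\delta_{K/2}(d_\ell)\big)^2\Big) \\ &= m(\theta_\ell) + m'(\theta_\ell)(x_i^\varepsilon - \theta_\ell) + O\big(\varphi^K(|d_\ell| - 1)^{-2}N^{-1}\big) \end{aligned} \qquad (4.19)$$

with high probability. Combining (4.17) and (4.19) yields, recalling (4.18) and (4.13),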



di ;^(«^ + ™(^^)) + 0{ip-'N-'/\\d,\ - 1) V2) 



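with high probability for large enough $K$, where in the last step we used (2.18). Thus we conclude that

$$x_i^\varepsilon = \lambda_i\Big(\theta_\ell - \frac{1}{m'(\theta_\ell)}\big(M_{[\gamma]}(\theta_\ell) + D^{-1}_{[\gamma]} + \varepsilon\Delta_{[\gamma]}\big)\Big) + O\big(\varphi^{-1}N^{-1/2}(|d_\ell| - 1)^{1/2}\big)$$

with high probability for small enough $\varepsilon$ (depending on $N$). Taking $\varepsilon \to 0$ completes the proof. □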

We conclude this section with a remark on the choice of the reference point $\theta_\ell$ in Proposition 4.5.
By definition of $\gamma$, if $\ell' \in \gamma(\ell)$ then $\gamma(\ell') = \gamma(\ell)$. Obviously, the distribution of the overlapping group of
outliers $(\mu_{\alpha(i)})_{i \in \gamma}$ cannot depend on the particular choice of $\ell \in \gamma$. Nevertheless, the reference matrix
$\theta_\ell - \frac{1}{m'(\theta_\ell)}\big(M_{[\gamma]}(\theta_\ell) + D^{-1}_{[\gamma]}\big)$ from (4.6) depends explicitly on $\ell \in \gamma$ via $\theta_\ell$. This is not a contradiction, however,
since a different choice of $\ell$ leads to a reference matrix which only differs from the original one by an error
term of order $O(\varphi^{-1}N^{-1/2}(|d_\ell| - 1)^{1/2})$; this difference may be absorbed into the error term on the right-hand
side of (4.6). We shall need this fact in Section 9. The precise statement is as follows. (To simplify
notation, we state it without loss of generality for the case $\gamma = [\![1, r]\!]$.)

Lemma 4.6. Suppose that 7(1) = |l,r] and that \di \ ^ 1 + tp^N^^/^. Let 



d,d £ 



^K/2+l{dl) , di + 6K/2+l{dl) 



Then for large enough K we have 



m'{e) 

where we abbreviated 9 = 9{d) and 9 = 9{d) 
Proof. We write



{M{e) + D-^)\ \§-—^^{M{e) + D-^) 



1 

m'{e) 

1 

m'{e) 



{M{0) - M{6)) 
{m{9) -m{e)) ^ 



1 1 

m'{e) m'{e) 

1 1 



m'{9) m'{e) 



{M{e) + D- 

m{()) + d-^) 



= d + J - d - i + (rf^ - 1) Q - I) + 0{^-^N-''\\d,\ - If'^) 

= 0(^-2^-l/2(|di|- 1)1/2) 



with high probabihty; in the second step we wrote M{9) — M{9) — Jg M'{^) and used (4.15) and Lemma 

; in the third step we 



4.3 



as well as Theorem 



3.3 



( [3!9| ), ( [S^S] ), ( |4.18[ ), and the fact that m"(x) 



used ( |2.5[ ), (3.3), and the assumption that iiT is large enough; in the last step we used that {d — d) ^ 

4v3-^"+vr-i(|rfi| - i)-i. □ 



5. The Gaussian case 



By Proposition 4.5 it suffices to analyse the distribution of the eigenvalues of the $|\gamma| \times |\gamma|$ matrix $M_{[\gamma]}(\theta_\ell)$.
Recall that $\gamma \equiv \gamma(\ell)$ may depend on $N$. To simplify notation, in Sections 5-7 we take $\gamma = [\![1, r]\!]$, which
allows us to drop subscripts $[\gamma]$ and avoid minor nuisances arising from the fact that $\gamma$ may depend on $N$.
In fact, this special case will easily imply the case of general $\gamma$; see Section 8.

The following definition is a convenient shorthand for the equivalence relation defined by two random 
matrices of fixed size having the same asymptotic distribution. 

Definition 5.1. For two sequences $X_N$ and $Y_N$ of random $k \times k$ matrices, where $k \in \mathbb{N}$ is fixed, we write
$X \overset{d}{\sim} Y$ if

$$\lim_N \big(\mathbb{E} f(X_N) - \mathbb{E} f(Y_N)\big) = 0$$

for all continuous and bounded $f$.



Let $\Phi$ be an $r \times r$ GOE/GUE matrix multiplied by $\sqrt{r}$. In other words, the covariances of
$\Phi$ are given by

$$\mathbb{E}\,\Phi_{ij}\Phi_{kl} = \Delta_{ij,kl}, \qquad (5.1)$$

where $\Delta_{ij,kl}$ was defined in (2.10).



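Proposition 5.2. The following holds for large enough $K$. Let $\theta \equiv \theta(d)$ for some $d$ satisfying $|d| \geq
1 + \varphi^K N^{-1/3}$. Suppose moreover that $H$ is a GOE/GUE matrix. Then

$$N^{1/2}(|d| - 1)^{1/2}\big(M(\theta) - m(\theta)\big) \overset{d}{\sim} \frac{1}{|d|\sqrt{|d| + 1}}\,\Phi.$$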

Proof. Throughout the proof we drop the spectral parameter $z = \theta$ from quantities such as $M(\theta)$. By
unitary invariance of $H$, we may assume that $V_{ij} = \delta_{ij}$, i.e. $\mathbf{v}^{(i)}$ is the $i$-th standard basis vector of $\mathbb{C}^N$. By
Schur's complement formula, we therefore get $M = B^{-1}$ where $B = (B_{ij})_{i,j=1}^r$ is the Hermitian $r \times r$ matrix
defined by

$$B_{ij} := h_{ij} - \theta\,\delta_{ij} - \sum_{a,b}^{(1 \cdots r)} h_{ia}\, G^{(1 \cdots r)}_{ab}\, h_{bj}.$$

a,b 
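We now claim that

$$\frac{1}{N}\sum_a^{(1 \cdots r)} G^{(1 \cdots r)}_{aa} = m + O\big(\varphi^C N^{-1}\kappa_\theta^{-1}\big). \qquad (5.2)$$

Bearing later applications in mind, we in fact prove, for any $\ell \in \mathbb{N}$, that

$$\frac{1}{N}\operatorname{Tr} G^\ell = \int \frac{\varrho(x)}{(x - \theta)^\ell}\,\mathrm{d}x + O\big(\varphi^C N^{-1}\kappa_\theta^{-\ell}\big) \qquad (5.3)$$

with high probability. Applying (5.3) with $\ell = 1$ to the minor $H^{(1 \cdots r)}$ immediately yields (5.2). In order to
prove (5.3), we use Theorem 3.7 to get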

1 



E 



(A„ - ey ^ (7a 



1 



C 



- l7o|)^+i 
(a/7V)-i/3 



N E ((a/7V)2/3 + ^^Y+i ^ ^ 7o (a;2/3 + hqY+i ^ ^ 



-1/3 



(5.4) 



with high probability, where in the first step we used that |Aq — 7^1 <C |6'| — |7q| with high probability by 
Theorem 3.7 and the assumption on 6 (for large enough K), and in the second step the estimate 



2-|7„| X a2/3^-2/3 
for a ^ N as follows from the definition of 7^. Similarly, setting 70 ■— —2, we find 



(5.5) 



N 



[x - ey 



N 



dx = 



g{x) 



^ 1 

E 



dx 



AT 

E 



(5.6) 



Now (|5^ follows from (|5^ and ( |5.6[). 

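Using $\mathbb{E}\,h_{ia}h_{bj} = \delta_{ij}\delta_{ab}N^{-1}$ and (3.8) we therefore get from (5.2)

$$\sum_{a,b}^{(1 \cdots r)} h_{ia}\, G^{(1 \cdots r)}_{ab}\, h_{bj} - \delta_{ij}\,m = (1 - \mathbb{E}_{1 \cdots r}) \sum_{a,b}^{(1 \cdots r)} h_{ia}\, G^{(1 \cdots r)}_{ab}\, h_{bj} + O\big(\varphi^C N^{-1}(|d| - 1)^{-2}\big)$$

with high probability. We may therefore write

$$B_{ij} = (-\theta - m)\,\delta_{ij} - (-h_{ij} + W_{ij} + R_{ij}),$$

where

$$W_{ij} := (1 - \mathbb{E}_{1 \cdots r}) \sum_{a,b}^{(1 \cdots r)} h_{ia}\, G^{(1 \cdots r)}_{ab}\, h_{bj} \qquad \text{and} \qquad R_{ij} = O\big(\varphi^C N^{-1}(|d| - 1)^{-2}\big)$$

with high probability.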
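Next, we claim that

$$|W_{ij}| \leq \varphi^C N^{-1/2}(|d| - 1)^{-1/2} \qquad (5.7)$$

with high probability. Indeed, using Lemma 3.6 we get

$$|W_{ij}| \leq \varphi^C \Big(\frac{1}{N^2}\sum_{a,b}^{(1 \cdots r)} \big|G^{(1 \cdots r)}_{ab}\big|^2\Big)^{1/2} = \varphi^C \Big(\frac{1}{N^2}\operatorname{Tr}\big(G^{(1 \cdots r)*}G^{(1 \cdots r)}\big)\Big)^{1/2} \leq \varphi^C N^{-1/2}(|d| - 1)^{-1/2}$$

with high probability. In the last step we used (5.3), (4.11), and $G = G^*$ to get (dropping the upper indices
to simplify notation)

$$\frac{1}{N^2}\operatorname{Tr}(G^*G) = N^{-1}m' + O\big(\varphi^C N^{-2}\kappa^{-2}\big) = O\big(N^{-1}\kappa^{-1/2} + \varphi^C N^{-2}\kappa^{-2}\big) = O\big(N^{-1}(|d| - 1)^{-1}\big)$$

with high probability.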

Using the bounds (5.7) and $|h_{ij}| \leq \varphi^C N^{-1/2}$ with high probability (as follows from (2.1)), we may expand
with (3.2) to get

$$M_{ij} = m\,\delta_{ij} + m^2(-h_{ij} + W_{ij}) + O\big(\varphi^C N^{-1}(|d| - 1)^{-2}\big)$$

with high probability. Let $H_{[1 \cdots r]} := (h_{ij})_{i,j=1}^r$ denote the upper $r \times r$ block of $H$. Thus we get

$$N^{1/2}(|d| - 1)^{1/2}(M - m) = m^2 N^{1/2}(|d| - 1)^{1/2}\big(-H_{[1 \cdots r]} + W\big) + O\big(\varphi^C N^{-1/2}(|d| - 1)^{-3/2}\big) \qquad (5.8)$$

with high probability. In particular, for large enough $K$ we get

$$N^{1/2}(|d| - 1)^{1/2}(M - m) \overset{d}{\sim} m^2 N^{1/2}(|d| - 1)^{1/2}\big(-H_{[1 \cdots r]} + W\big). \qquad (5.9)$$

By definition, $H_{[1 \cdots r]}$ and $W$ are independent. What therefore remains is to compute the asymptotic
distribution of $W$. We claim that $W$ converges in law to an $r \times r$ Gaussian matrix:



(5.10) 



By the Cramer- Wold device, it suffices to show that 

7VV2(|d|_ 1)1/2 ^Q,,^.t^,^ 



for any deterministic matrix Q ~ (Qij) satisfying Q — Q* and Qij e M if /3 = 1. Let {Aa)a=r+i denote the 
eigenvalues of N~^/^{\d\ — 1)^/^G^^ ' ''''. Then by unitary invariance of the vector {hai)a=r+i fo^' i ^ Pi ''I 
and the fact that H^^'"'''^ is independent of the family [hia : i ^ r, a ^ r + 1) we have 

r r 
— 1 a 



Note that Q 
with variance 2/3^^ TrQ^. Therefore 



N 



a—r+1 



is a family of i.i.d. random variables, independent of (Aj,), 



2 ''-"^ 2 

EX^ = ^TrQ2 E = ^TrQ2iv-i(|d|-l)Tr(G(i-'-))2 



/3 



Tr 



g2((|d|-l)m' + 0(^'^7V"i(|d|-l)-3)) 



TrQ2(^(|d| - l)m' + 0((^-i) 



with high probability for large enough K, where we used (5.3). Moreover, we have 

(l-r) 

^ A* = 7V-2(|d| - l)2Tr(G(i-'-))4 = N~^i\d\-l)^{Nm"'/6 + 0iip^i\d\-l)-^)) 

a 

= 0(iV-l(|d|-l)-3+iV-2(|rf|_l)-6) ^ 0(^-1) 



with high probability for large enough K, where in the second step we used (5.3 ) an d in the third step the 
estimate m'" x k^^^^ as follows by differentiating (4.18) twice and from Lemma 3t] 
We conclude from the Central Limit Theorem that



where we used the identity 



(Ml - 



Ml + i 



as follows from (4.181 and (3.3). Thus ( |5.10 1 follows the identity 



TrQ^ , 



as follows from a from a simple variance calculation. 
Next, by definition of H\i...^-i we have 



Thus we find 



-iVi/2(|d|-l)i/2i7[,_j A 1)1/2$ 

iVV2(|rf|_l)l/2(_^ £ Ml 



The claim now follows from (5.9) and (3.3) 



□ 
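The constant $1 / (|d|\sqrt{|d| + 1})$ in Proposition 5.2 can be traced through a short variance computation. The sketch below is our own bookkeeping rather than a step quoted from the proof. It assumes, as the proof indicates, that the two summands on the right-hand side of (5.9) are independent, that $-N^{1/2}H_{[1 \cdots r]}$ has the covariance (5.1), and that the normalized $W$ contributes the variance factor $(|d| - 1)\,m'(\theta)$. It uses $m(\theta) = -1/d$ from (3.3) and the identity $m' = m^2 / (1 - m^2)$ from (4.18).

```latex
% m'(\theta) from (4.18) and (3.3):
m'(\theta) = \frac{m(\theta)^2}{1 - m(\theta)^2} = \frac{d^{-2}}{1 - d^{-2}} = \frac{1}{d^2 - 1},
\qquad
(|d| - 1)\, m'(\theta) = \frac{1}{|d| + 1} .

% Variance factors of the two independent terms in (5.9), before the prefactor m^2:
(|d| - 1) + \frac{1}{|d| + 1} = \frac{(|d| - 1)(|d| + 1) + 1}{|d| + 1} = \frac{d^2}{|d| + 1} .

% Multiplying the resulting standard deviation by m(\theta)^2 = d^{-2}:
d^{-2} \sqrt{\frac{d^2}{|d| + 1}} = \frac{1}{|d| \sqrt{|d| + 1}} .
```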



6. The almost Gaussian case 

The next step of the proof is to consider the case where most entries of H are Gaussian. 

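Proposition 6.1. The following holds for large enough $K$. Let $\theta \equiv \theta(d)$ for some $d$ satisfying $|d| \geq
1 + \varphi^K N^{-1/3}$. Let $\rho \geq 2$. Suppose that the Wigner matrix $H$ satisfies

$$\max_l \max\{|V_{il}|, |V_{jl}|\} \leq \varphi^{-\rho} \quad \Longrightarrow \quad h_{ij} \text{ is Gaussian.} \qquad (6.1)$$

Then

$$N^{1/2}(|d| - 1)^{1/2}\big(M(\theta) - m(\theta)\big) \overset{d}{\sim} -N^{1/2}(|d| - 1)^{1/2}\,d^{-2}\,V_\delta^* H V_\delta + \Psi_0,$$

where $\Psi_0 = \Psi_0^*$ is a Gaussian matrix, independent of $H$, with centred entries and covariance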

Proof. Throughout the proof we drop the spectral parameter $z = \theta$ from our notation.

Step 1. We start with some linear algebra in order to write the matrix $M$ in a form amenable to analysis.
Since $\|\mathbf{v}^{(l)}\| = 1$ for all $l$ we find that

$$\big|\{i : |V_{il}| > \varphi^{-\rho}\}\big| \leq \varphi^{2\rho}.$$

We shall permute the rows of $V$ by using an $N \times N$ permutation matrix $O$ according to $M = V^*GV =
(OV)^*\,OGO^*\,OV$. It is easy to see that we may permute the rows of $V$ by setting $V \mapsto OV$ so that after
the permutation we have

$$V = \begin{pmatrix} U \\ W \end{pmatrix},$$

where

(i) $U$ is a $\mu \times r$ matrix and $W$ an $(N - \mu) \times r$ matrix,
(ii) $|W_{il}| \leq \varphi^{-\rho}$ for all $i$ and $l$,

(iii) $\mu \leq r\varphi^{2\rho}$.

After the permutation $H \mapsto OHO^*$, we may write $H$ as

$$H = \begin{pmatrix} A & B^* \\ B & H_0 \end{pmatrix},$$

where $A$ is a $\mu \times \mu$ matrix, $B$ an $(N - \mu) \times \mu$ matrix, and $H_0$ an $(N - \mu) \times (N - \mu)$ matrix with Gaussian
entries (as follows from (6.1)).
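The row permutation in Step 1 is elementary, and the bound on $\mu$ is a counting argument: a unit column can have at most $t^{-2}$ entries of modulus larger than $t$. The following sketch illustrates both. The function name `split_rows` is hypothetical, and `threshold` stands in for $\varphi^{-\rho}$.

```python
def split_rows(V, threshold):
    """Reorder the rows of V (given as a list of rows) so that every row with
    an entry larger than `threshold` in modulus comes first.
    Returns (permutation, mu), where mu is the number of such rows."""
    large = [i for i, row in enumerate(V) if any(abs(x) > threshold for x in row)]
    small = [i for i, row in enumerate(V) if all(abs(x) <= threshold for x in row)]
    return large + small, len(large)
```

The first `mu` indices of the returned permutation select the block $U$ and the remaining ones the block $W$. For columns of unit norm one always has `mu <= r / threshold**2`, which is property (iii).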

Next, we rotate the rows of W by choosing a unitary {N — ^) x (N — /i) matrix S such that 



SW 



where is an r x r matrix that satisfies 

U*U + W*W ^ U*U + W*W 



(6.2) 



Thus we get 



M = V 




A-e 


B* 


B 


Ho- 




(^\ 


r 


W 







1 

s* 



1 

s 



V 



where = denotes equality in distribution. Here we used the unitary invariance of the Gaussian matrix Hq. 
Next, we decompose 



Hn = 



Hi Z* 
Z H2 



S = 



where Hi is an rxr Gaussian matrix, Z an {N — fj,—r) xr Gaussian matrix, and H2 an [N — fi—r) x (iV— /i— r) 
Gaussian matrix. Moreover, i? is an r x (N — 11) matrix and we have 



Thus we find 



M 



RR* 




SS* 


= '^N-fj.-r , 


RS* 




/A-e 


B*R* 


B*S* \ 




f) 


RB 


Hi-e 




(!) 




V SB 


z 


H2-e 





where 



Y : = 



F := {SB,Z), 



0, 



A : = 



R*R + S*S 



A-e F* 
F H2- 



f A B*R' 
\RB Hi 



l-N-fi 



e. 



Here y is a (/i + r) x r matrix satisfying Y*Y = Ir, and F is an {N — ji — r) x {fi + r) matrix. 
Step 2. We claim that 



F*F = 1 



0((^^iV-l/2) 



(6.3) 



with high probability (in the sense of matrix entries). In order to prove (6.3), we write 

F*F = 



B*S*SB B*S*Z 
Z*SB Z*Z 



and consider each block separately. For i ^ j we get using (3.14) 



\{B*S*SB] 



Bki(S* S)kiB, 



k,l 



N 



k,l 



1/2 



,C 



N 



{Tr{S*SfY^' (^^TV 
with high probability. Similarly, (3.12) and (3.13) yield



k^l 



with high probabihty, where we used that N''^ Tt S* S = 1 - (/i + r)N~'^. Next, from ( |3.12[ ), ( |3.13| ), and 
(3.14) we easily get 

Z*Z = lr + 0{<p^N-^'^) 



(6.4) 



with high probability. Finally, (3.14) yields 

\{B*S*Z)i-j\ = BkiSliZi 



k,l 



< 



N 



k.l 



1/2 



N 



with high probability. This concludes the proof of (6.3 1. 
Next, we define 



and claim that 



G2 := 



F*G2F = TO + 0(</'^Ar-i/2(|rf|_i)-i/2) 



(6.5) 



with high probability (in the sense of matrix entries). Since N^/-^{N—fj,—r) ^^^H2 is an (N—fi—r) x {N—fi—r) 
GOE/GUE matrix that is independent of F, follows from Theorem [331 P^ , ^^'^ @- 

Step 3. For the following we use the letter £ to denote any (random) error term satisfying \£\ < 
ip'^ N~^{\d\ — 1)^^ with high probability for some constant C. We apply Schur's complement formula to get 

e = Y*(-e-m- {-A + F*G2F~m)^ V 

= mY*Y - m^Y*AY + (y*F*G2FY - mY*Y^ + £ 
= m-m^Y*AY + m^(Y*F*G2FY ~m] +£ 



where in the second step we expanded using (3.2| and estimated the error term using (6.5), /i ^ ip , and 
\\A\\ (/3C'jY-i/2 ^j^j^ jjjgj^ probability. Using R*W = 1^ we get 

e = m-m'^{U*AU + U*B*W + W*BU + W*HiW) +m^Y*F*{G2-m)FY + m^{Y*F*FY -l)+£ . 

Next, we rewrite the term Y*F*{G2 — m)FY so as to decouple the randomness of H2 from that of F. From 
(|6.3|) we find 

0((^^iV-i/2) 



Y*F*FY = 1 

with high probability. Define the deterministic (N — 11 — r) x r matrix 



(6.6) 



Using (6.61 as well as Gaussian elimination on the matrix FY, it is not hard to see that there is a unitary 
{N — II — r) X {N — fj, ~ r) matrix Oi, which is F-measurable, such that 

||OiFy-£:i|| ip^N^^^^ 

with high probability. Using Theorem |3.3| and the fact that F and H2 are independent, we therefore get 

{OiFY)* {G2 - m)OiFY ^ El{G2 - m)Ei + £ . 

We conclude that 

M = m-m\U*AU + U*B*W + W*BU + W*HiW)+m^El{G2~rn)Ei+m^{Y*F*FY-l)+£, 
where we used that O1G2OI = G2 and that all terms apart from m^El{G2 — m)Ei are independent of H2-
Next, we compute 

Y*F*FY = U*B*S*SBU + U*B*S*ZW + W*Z*SBU + W*Z*ZW 

= U*B*BU - U*B*R*RBU + U* B* S* ZW + W*Z*SBU + W* Z* ZW 
= U*B*BU + U*B*S*ZW + W*Z*SBU + W*Z*ZW + 0{lp^ N^^) 

with high probability, where in the last step we used Lemma 3.6 and Tr(_R*_R)^ — r. Using (6.2) we rewrite 

^JL + r'. 



U*B*BU + W*Z*ZW -1 = m{U*B*BU + W*Z*ZW) - -^U*U - ^ 

where we introduced the notation lEX .= X — EX. 
Thus we conclude that 



W*W, 



M -m 



Gi + 62 + 63 + 84 + f . 



(6.7) 



where 

61 rr? El{G2 ~ m)Ei . 
63 -m'^W*HiW, 



U*B*W + W*BU) + m^m[u*B*BU + U* B* S* ZW + W*Z*SBU + W* Z* Zw) . 



64 -r 



By definition, the random variables $\Theta_1$, $\Theta_2$, $\Theta_3$, and $\Theta_4$ are independent.

Step 4- We compute the asymptotics of 81, 82, and 83. Since n + r ^ ip'~' , we may apply Proposition 



5.2 to the {N — ^jl — r) x {N — fi — r) Gaussian matrix H2 to get 

1 



ivi/2(|d|-i)i/2e^ £ 



Here we used (3.3). Recall that $ is the rescaled GOE/GUE matrix satisfying (5.1). 

In order to deal with 82, we introduce, in analogy to Vs, the matrix Us — (Ufi) whose entries are defined by 
.= Uul{\U^i\ > 5). In particular, since S ^ ip''^ ip'P, we have Vs = (^■'). Writing Us = (Ufi) ■■= U -Us, 

we get 

U*AU = U*sAUs + U*sAUs + U^AUs + U^AUs . 

Since \Ufi\ < S, the Central Limit Theorem implies that := U^AUs + U^AUs and -^2 UgAUs are 
asymptotically Gaussian, and a simple calculation yields the covariance 

iVE(1'i)y(*i)fc, = 2%,m{U*sUs,U*sUs), A^E(^'2).,(*2)fe( = %jm{U*sUs,U*sUs) , 

where we defined 

Tij,ki{R,T) ■■= -^(^RiiTkj + RkjTu + 1{(3 — l)(^RikTji + RjiTik^ 
Similarly, 83 is Gaussian with covariance 

A^E(83).,(83)fe; = d-^T^JMW*W,W*W), 



where we used (3.3). Using UgAUs = VgHVs we therefore conclude that 



1)1/2(9, + + 83 



1 



\d\^VW+i 



$_7V 



1/2 (Ml -1)^/^ 



Vs*HVs + *3 



(6.8) 



where ^'3 is Gaussian with covariance 



E(*3).,(* 



\d\-l 
d^ 



(2%jm{U*sUs, U*sUs) + %j,kiiU*sUs, U*sUs) + T^jm{W*W, W*W)) . (6.9) 
Step 5. Next, we compute the asymptotics of $\Theta_4$. We shall prove that $N^{1/2}(|d| - 1)^{1/2}\Theta_4$ is asymptotically
Gaussian, and compute its covariance matrix.
Using Lemma 3.6 we find



uBkiBij + '^^{S* S)kkBkiBkj 

k^l k 

= ^'^{S* S)kiBkiBij + ''^{S*S)kk i BkiBkj - 

k^l k ^ 

with high probability. Define the deterministic (N — fi — r) x ^ matrix 

In 



N 



N - fi-r , 

c 

N 



(6.10) 



Exactly as after (6.101 we find that (6.10) and Gaussian elimination imply that there is a unitary {N — /i - 
r) X {N — /i — r) matrix O2, which is i?- measurable, such that 

with high probability. Thus we get 

\(W*Z*{02SB - E2)U),] = 



Y.W*kJ2{i02SB ~ E2)U) Zik 



1/2 



< ip^N-^^^ (U* {O2SB - E2y {O2SB - E2)U^^ 

with high probability. Using that Z is independent of B and O2 , we therefore find 

84 ^ -m^{U*B*W + W*BU) +mH¥.{u*B*BU + U*E*ZW + W*Z*E2U + W*^ 
Defining the [N — ij, — r) x r matrix 



U := E2U 



u 

(N-fj,-2r)xr 



we therefore have 
where 



64 = e', + e'l + s, 



61 



e:; := m-' 



■{U*B*W + W*BU) + mHE{U*B*BU) , 

u*zw + w*z*u + m{wz*zw)) . 



By definition, 64 and 84 are independent. Recalling that \Wii\ ^ ip we find from the Central Limit 
Theorem that N^/^Q'^ and N^/^B'l are each asymptotically Gaussian. Hence it suffices to compute their 
covariances. A straightforward computation yields 

NE{Q'^),,{Q'^)ki = 2m^%j.ki{U*U, W*W) - m^Q,,, W) + m'' {%,,ki{U*U, U*U) + 7^,,■ , 

where we defined 



Q^JMiU,W) := N-'/^'Y,{Ua^UakUal^i^abWb,+W,a^^^^,^UbJUbkUb, 
(By a slight abuse of notation, we write Tlij,ki{U) by identifying U with the N x r vector (^)
We may similarly deal with 9^'. Using U*U =^U*U and W*W = W*W wc find 



Combining Q'^ and 9", and recalling (3.3), we find 



A^E(94).,(94)fci = 2d-^%jMiU*U,W*W) + d-''Q,jM{U,W) + d-'^{A,,M +n,,MU)) , (6.11) 
where we used that 

%,MU*U,U*U) + %jMW*W,W*W) + 2T,^,ki{U*U,W*W) = 7-,,fe/(l,l) = A,j,ki, 

as follows from W*W + U*U = 1. 

Step 6. We may now consider the sum 9i + 92 + 93 + 94. From (6.7), (6.8), (6.9), (6.11), and the 
definition of 5 , we get 

N^/^{\d\~lf'^{M ~~m) ^ -iVi/2(|d| - 1)1/2^-2 p^;//]/^- + 

where ^'4 = is a Gaussian matrix, independent of H , with covariance 



E(*4).y(*4)fci 
\d\-l 



d^ 



A 



ij^kl 



v.. 



ij^kl 



iVs*Vs)) 



\d\ 



d^ 



\d\-l 

d6 



d'{\d\ + 1) 



Here we used that 



2%jm{UsUs, U;Us) + %j,kiiUsUs, U^Us) + %jm{W*W, W*W) + 2%,m{U*U, W*W) 

= A,j,ki-%jMiU*sUs,U^Us) = A,,M-V.,jM{ViVs), 

as follows from the bilinearity of Tij.kii'^ well as the identities Tij.kii'^, 1) — Aijki, 1 = UgUs + UgUs 
W*W, and U^Us ^Vs*Vs. 

Using that f7 is a /i x r matrix with /i ^ rip'^P and \ Wii\ ^ 'fi^'^, we easily find that 



(6.12) 



Since p ^ 2, it is not hard to see that the errors on the right-hand side of (6.12) are bounded from above (in 



the sense of matrices) by the matrix Eij ki — '-p Ay j,;. In particular, from (6.11) we get that the matrix 



2d-^%jM{U*U, W*W) + d-5Q„- fc,(F) + d-6(A,j- fcj + ■R,,m{V)) + E,,^ki 



is nonnegative, from which we conclude that the right-hand side of (2.11) is nonnegative. This completes 
the proof. □ 



7. The general case 



The general case follows from Proposition 6.1 and Green function comparison. The argument is almost
identical to that of Section 7.4 in [21], and we only sketch the differences.

Let $H = (N^{-1/2}X_{ij})$ be an arbitrary real symmetric / complex Hermitian Wigner matrix and $(N^{-1/2}Y_{ij})$
a GOE/GUE matrix independent of $H$. For $\rho > 0$ define the subset

$$I_\rho := \big\{i \in [\![1, N]\!] : |V_{il}| \leq \varphi^{-\rho} \text{ for all } l \in [\![1, r]\!]\big\}.$$

Define a new Wigner matrix $\widehat{H} = (N^{-1/2}\widehat{X}_{ij})$ through

$$\widehat{X}_{ij} := \begin{cases} Y_{ij} & \text{if } i \in I_\rho \text{ and } j \in I_\rho \\ X_{ij} & \text{otherwise.} \end{cases}$$
Thus, $\widehat{H}$ satisfies the assumptions of Proposition 6.1. Let

Jp — ■■! j ^ N , i e Ip and j e Ip} 

Choose a bijective map (f> : Jp ^ {1, . . . ,\Jp\}. For 1 ^ t ^ \Jp\ denote by — the Hermitian matrix 
defined by 



N^^/'^Xij otherwise 



< j) ■ 



In particular, _ffo = H and H\j^\ = H . Let now (a, b) G Jp satisfy (/)(a, h) = r. We write 
and 

Here iJ^"'') denotes the matrix with entries E^j''^ := daiSbj- Hence we have Qab — Qba = 0, and the matrices 
Hr-i and Hr differ only in the entries (a, b) and (&, a). 



Next, we introduce the resolvents 

Q - Z 



5(z) 



1 



T{z) 



1 



Hr-Z 



Let |d| > 1 + (^-^iV-i/a. Set z := 6'(d) + iiV-" (as 



21 



Section 7.4], we add a small imaginary part to z to 

(7.1) 



ensure weak control on low-probability events) and define 

XR := N^^^{\d\-iy/^{V*R{z)V -m{z)) . 

The quantities xs and xt are defined analogously with R replaced by S and T respectively. 

The following estimate is the main comparison estimate. It is very similar to Lemma 7.13 of 21 . 

Lemma 7.1. Provided p is a large enough constant, the following holds. Let f S C^{C^^^) be bounded with 
bounded derivatives and q = be an arbitrary deterministic sequence of r x r matrices. Then 

E/(XT +q) = E/(.T^ + q)+i2 + + + 0{ip-'£ab) , 



Efixs+q) = EfixR + q)+Aab + 0{^~^£ab): 

where Aab satisfies \Aab\ ^ 'P^^, 



(7.2) 
(7.3) 



-N-\\d\ - l)^/'(mVitV„n, + mVi'Vb.K,) 



£ab 



J2 ^ ^-2W2+./2|p.^,^|.|^^^.|.+^^^^^^-lW2|V.^^ 



(7.4) 



Proof. The proof follows the proof of Lemma 7.13 of [21] with cosmetic modifications whose details we 
omit. □ 



Using Lemma 7.1 we may now complete the proof in the general case. The following proposition is the 
main result of this section, and is the conclusion of the arguments from Sections 5-7. 



Proposition 7.2. The following holds for large enough $K$. Let $\theta = \theta(d)$ for some $d$ satisfying $|d| \ge 1 + \varphi^K N^{-1/3}$. Then 



iVi/2(|d| - 1)1/2 



d2 



ViHVs 



(Ml - lY'^sjv) 



where $\Psi$ is the Gaussian matrix from Proposition 6.1. 



Proof. The proof follows the proof of Theorem 2.14 in Section 7.4 of [21] with cosmetic modifications whose 
details we omit. The main inputs are Proposition 6.1 and Lemma 7.1. The imaginary part of the spectral 
parameter z = 9{d)+\N~'^ is easily removed using the estimate m(z) = —d + 0{N~^). The condition f G 
in Lemma [7.1 1 can be relaxed to / S C by standard properties of weak convergence of measures. □ 
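The removal of the imaginary part rests on the fact that $m$ evaluated at the outlier location equals $-d^{-1}$ up to a small error. The following sketch checks this numerically. It assumes the standard definitions from the earlier sections, which are not restated here: $\theta(d) = d + d^{-1}$ for $|d| > 1$, and $m$ the Stieltjes transform of the semicircle law on $[-2, 2]$, i.e. the root of $m^2 + zm + 1 = 0$ with $|m| < 1$. The function names are illustrative.

```python
import cmath

def theta(d):
    """Assumed outlier location theta(d) = d + 1/d for |d| > 1."""
    return d + 1.0 / d

def m_semicircle(z):
    """Stieltjes transform of the semicircle law on [-2, 2]:
    the root of m^2 + z*m + 1 = 0 with the smaller modulus."""
    s = cmath.sqrt(z * z - 4.0)
    roots = ((-z + s) / 2.0, (-z - s) / 2.0)
    return min(roots, key=abs)  # the two roots multiply to 1, so this one has |m| <= 1

def gap(d, eta):
    """|m(theta(d) + i*eta) + 1/d|, which should be small for small eta."""
    return abs(m_semicircle(theta(d) + 1j * eta) + 1.0 / d)
```

For instance, `gap(1.5, 1e-8)` and `gap(-2.0, 1e-8)` are of the order of `eta`, consistent with $m$ being Lipschitz away from the spectral edges.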
8. Conclusion of the proof of Theorems 2.5 and 2.9 



We may now conclude the proof of Theorems 2.5 and 2.9. First we note that Theorem 2.5 is an easy corollary 
of Theorem 2.9. We focus therefore on the proof of Theorem 2.9. 



Fix $K$ to be the constant from Proposition 7.2. Fix $\ell \in \llbracket 1, r \rrbracket$ and define the subset 

$$A := \bigl\{ N \in \mathbb{N} : |d_\ell| \ge 1 + \varphi^K N^{-1/3} \bigr\}.$$



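We assume that $A$ is a subsequence (i.e. infinite), for otherwise the claim of Theorem 2.9 is vacuous. For 
given $s > 0$ we introduce the partition 

$$A = \bigcup_{\gamma, \pi} A_{\gamma,\pi}(s), \tag{8.1}$$

where the union ranges over subsets $\pi$, $\gamma$ of $\llbracket 1, r \rrbracket$ satisfying $\ell \in \pi \subset \gamma \subset \llbracket 1, r \rrbracket$, and 

$$A_{\gamma,\pi}(s) := \bigl\{ N \in A : \gamma_N(\ell) = \gamma ,\ \pi_N(\ell, s) = \pi \bigr\},$$

where $\pi_N(\ell, s) \equiv \pi(\ell, s)$ and $\gamma_N(\ell) \equiv \gamma(\ell)$ are the subsets from Definitions 2.8 and 4.2 respectively. 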
We shall prove the following result. 

Proposition 8.1. Fix $\ell$, $\pi$, and $\gamma$ satisfying $\ell \in \pi \subset \gamma \subset \llbracket 1, r \rrbracket$. Let $\varepsilon > 0$ be given, and let $f_1, \dots, f_r$ be 
bounded continuous functions, where $f_k$ is a function on $\mathbb{R}^k$ satisfying $\|f_k\|_\infty \le 1$. Then there exist constants 
$N_0$ and $s_0$, both depending on $\varepsilon$ and $f_1, \dots, f_r$, such that (2.19) holds for all $s \ge s_0$ and all $N \ge N_0$ satisfying 
$N \in A_{\gamma,\pi}(s)$. 

Before proving Proposition 8.1, we note that it immediately implies Theorem 2.9, since the partition 
(8.1) ranges over a finite family containing $O(1)$ elements. 



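Proof of Proposition 8.1. From (4.5) we know that the spectrum of the deformed matrix contains with high probability precisely $|\gamma|$ 
outliers, namely $(\mu_{\alpha(i)})_{i \in \gamma}$. Following (2.14), for $i \in \gamma$ we introduce the rescaled eigenvalues 

$$\zeta_i := N^{1/2} (|d_i| - 1)^{-1/2} (\mu_{\alpha(i)} - \theta_i).$$

In order to identify the asymptotics of $\zeta_i$ we introduce the $|\gamma| \times |\gamma|$ matrices 

$$X \equiv X_N := -N^{1/2} (|d_\ell| - 1)^{1/2} (|d_\ell| + 1) \bigl( M_{[\gamma]}(\theta_\ell) - m(\theta_\ell) \bigr),$$
$$Y \equiv Y_N := -N^{1/2} (|d_\ell| - 1)^{1/2} (|d_\ell| + 1) \bigl( D^{-1}_{[\gamma]} - d_\ell^{-1} \bigr).$$

Note that $X$ is random and $Y$ deterministic. From (4.6), (3.3), and (4.18) we get for all $i \in \gamma$ that 

$$|\zeta_i - \lambda_i(X + Y)| \le \varphi^{-1} \tag{8.2}$$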



with high probability. By Proposition 7.2 and Remark 2.3, the family $(X_N)_N$ is tight. 
By definition of $\pi$ and Lemma 4.3, if $i \in \pi$ and $j \in \gamma \setminus \pi$ then 

$$|d_i - d_j| \ge s N^{-1/2} (|d_\ell| - 1)^{-1/2} / 2. \tag{8.3}$$



We have the splitting 

We shall apply perturbation theory to the matrix $X + Y$. In order to do so, we truncate $X$ by defining 
$X^t := X \mathbf{1}(\|X\| \le t)$ for $t > 0$. Then by tightness of $X$ there exists a $t \equiv t(\varepsilon) > 0$ such that 



V{Xn ^X%) ^ - 
5 

for all $N$. For the truncated matrices we find the spectral gap 

$$\operatorname{dist}\bigl( \sigma(X^t_{[\pi]} + Y_{[\pi]}) ,\, \sigma(X^t_{[\gamma \setminus \pi]} + Y_{[\gamma \setminus \pi]}) \bigr) \ge \operatorname{dist}\bigl( \sigma(Y_{[\pi]}) ,\, \sigma(Y_{[\gamma \setminus \pi]}) \bigr) - 2t \ge cs - 2t, \tag{8.4}$$

where the constant $c$ only depends on $\Sigma$ in (2.2); here in the last step we used (8.3). Proposition A.1 therefore 
yields 



$$\bigl| \lambda_i(X^t + Y) - \lambda_i(X^t_{[\pi]} + Y_{[\pi]}) \bigr| \le \frac{t^2}{cs - 2t - 2t}. \tag{8.5}$$



We conclude that there exist an $s_0$ and an $N_0$, both depending on $\varepsilon$ and $f_{|\pi|}$, such that for $s \ge s_0$ 
and $N \ge N_0$ satisfying $N \in A_{\gamma,\pi}(s)$ we have 



< 



< 



2£ 

IT 



where in the first step we used (8.4), in the second step (8.5) and dominated convergence, in the third 
step (8.4) again, and in the last step (8.2) and dominated convergence. Proposition 8.1 now follows from 
Proposition 7.2 applied to the $|\pi| \times |\pi|$ matrix $X_{[\pi]}$. 



9. The joint distribution: proof of Theorem 2.11 



In this final section, we extend the arguments of Sections 4-8 to cover the joint distribution of all outliers, 
and hence prove Theorem 2.11. 

We begin by introducing a coarser partition $\Gamma$, defined analogously to $\Pi$ from Definition 2.10 except that 
$\pi(\ell, s)$ is replaced with $\gamma(\ell)$ from Definition 4.2. 



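Definition 9.1. Let $N$ and $D$ be given, and fix $K > 0$. We introduce a partition $\Gamma \equiv \Gamma(N, K, D)$ of a 
subset of $\llbracket 1, r \rrbracket$, defined as 

$$\Gamma := \bigl\{ \gamma(\ell) : \ell \in \llbracket 1, r \rrbracket ,\ |d_\ell| \ge 1 + \varphi^K N^{-1/3} \bigr\}.$$

We also use the notation $\Gamma = \{\gamma\}_{\gamma \in \Gamma}$. 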



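It is immediate from Definitions 2.10 and 9.1 that $[\Pi] \subset \bigcup_{\gamma \in \Gamma} \gamma$ and that for each $\pi \in \Pi$ there is a 
(unique) $\gamma \in \Gamma$ such that $\pi \subset \gamma$. In analogy to the corresponding definitions for $\pi \in \Pi$, we set for definiteness 

$$d_\gamma := \min\{ d_i : i \in \gamma \}, \qquad \theta_\gamma := \theta(d_\gamma).$$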



Note that for tt e 7 we have 



d^ 



= 1 + 0(1) 



M^I-1 
M7I - 1 



= 1 + 0(1) 



(9.1) 



The following result follows from Proposition 4.5 and (4.18). 



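Proposition 9.2. The following holds for large enough $K$. For any $\gamma \in \Gamma$ and $i \in \gamma$ we have 

$$\Bigl| \mu_{\alpha(i)} - \lambda_i\Bigl( \theta_\gamma - (d_\gamma^2 - 1) \bigl( M(\theta_\gamma) + D^{-1} \bigr)_{[\gamma]} \Bigr) \Bigr| \le \varphi^{-1} N^{-1/2} (|d_\gamma| - 1)^{1/2} \tag{9.2}$$

with high probability. 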



As in the footnote to Definition 2.10, it is easy to see that $\Gamma$ is a partition. 
As in Section 8, we may assume without loss of generality that the partitions $\Pi$ and $\Gamma$ are independent 
of $N$. (Otherwise partition 

$$\mathbb{N} = \bigcup_{\Gamma, \Pi} A_{\Gamma,\Pi}(s), \qquad A_{\Gamma,\Pi}(s) := \bigl\{ N \in \mathbb{N} : \Gamma(N, K, D) = \Gamma ,\ \Pi(N, K, s, D) = \Pi \bigr\}.$$

Since the union is over a finite family of $O(1)$ subsets of $\mathbb{N}$, we may first fix $\Gamma$ and $\Pi$ and then restrict 
ourselves to $N \in A_{\Gamma,\Pi}(s)$.) As in the proof of Proposition 8.1, we define for each $\pi \in \Pi$ the $|\pi| \times |\pi|$ matrix 

$$X^\pi := -N^{1/2} (|d_\pi| - 1)^{1/2} (|d_\pi| + 1) \bigl( M(\theta_\pi) - m(\theta_\pi) \bigr)_{[\pi]}.$$



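The joint distribution of $(X^\pi)_{\pi \in \Pi}$ is described by the following result, which is analogous to Proposition 7.2. 

Proposition 9.3. For large enough $K$ we have 

$$\bigoplus_{\pi \in \Pi} X^\pi \overset{d}{\sim} \bigoplus_{\pi \in \Pi} (\Upsilon^\pi + \Psi^\pi), \tag{9.3}$$

where $\Upsilon^\pi$ and $\Psi^\pi$ were defined in Section 2.4. 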

We postpone the proof of Proposition 9.3 to the next section, and finish the proof of Theorem 2.11 first. 
In order to identify the location of $\zeta^\pi_i$, we invoke Proposition 9.2 and make use of the freedom provided by 
Lemma 4.6 in order to change the reference point $\theta_\gamma$ at will. Thus, Proposition 9.2 and Lemma 4.6 yield, 
for any $\pi \in \Pi$, $i \in \pi$, and $\gamma \in \Gamma$ containing $\pi$, that 

$$\zeta^\pi_i = N^{1/2} (|d_\pi| - 1)^{-1/2} (\mu_{\alpha(i)} - \theta_\pi) = -N^{1/2} (|d_\pi| - 1)^{1/2} (|d_\pi| + 1) \, \lambda_i\bigl( (M(\theta_\pi) + D^{-1})_{[\gamma]} \bigr) + O(\varphi^{-1}) \tag{9.4}$$

with high probability, where we used (4.18), (9.1), and Lemma A.2. 



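Next, for $\pi \in \Pi$ let $\gamma(\pi)$ denote the unique element of $\Gamma$ that contains $\pi$. For each $\pi \in \Pi$ we introduce 
the $|\gamma(\pi)| \times |\gamma(\pi)|$ matrices 

$$\widetilde X^\pi := -N^{1/2} (|d_\pi| - 1)^{1/2} (|d_\pi| + 1) \bigl( M(\theta_\pi) - m(\theta_\pi) \bigr)_{[\gamma(\pi)]},$$
$$Y^\pi := -N^{1/2} (|d_\pi| - 1)^{1/2} (|d_\pi| + 1) \bigl( D^{-1} - d_\pi^{-1} \bigr)_{[\gamma(\pi)]}.$$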



Thus (9.4) reads 

$$\zeta^\pi_i = \lambda_i(\widetilde X^\pi + Y^\pi) + O(\varphi^{-1})$$

with high probability. By Proposition 7.2 and Remark 2.3, $\widetilde X^\pi$ is tight (in $N$). We may now repeat verbatim 
the truncation and perturbation theory argument from the proof of Proposition 8.1 following (8.2). The 
conclusion is that there exist an $s_0$ and an $N_0$, both depending on $\varepsilon$ and $f_{|[\Pi]|}$, such that for $s \ge s_0$ and 
$N \ge N_0$ we have 



E/|[n]|((Cr).en,«E.) -E/|[nj| (A.[(A- + r-)[^]]) 



The claim now follows from Proposition 9.3 and the observation that $(\widetilde X^\pi)_{[\pi]} = X^\pi$. This concludes the 
proof of Theorem 2.11. 



9.1. Proof of Proposition 9.3 

What remains is to prove Proposition 9.3. Clearly, it is a generalization of 
Proposition 7.2. The proof of Proposition 9.3 relies on the same three-step strategy as that of Proposition 
7.2: the Gaussian case, the almost Gaussian case, and the general case. 

We begin with the Gaussian case (generalization of Section 5). 

Proposition 9.4. Suppose that H is a GOE/GUE matrix. Then for large enough K we have 



^N''\\dA^lf"{Am)-m{e.)),^. ^ 



1 



TTsn 



Here $(\Phi_\pi)_{\pi \in \Pi}$ is a family of independent Gaussian matrices, where each $\Phi_\pi$ is a $|\pi| \times |\pi|$ matrix whose 
covariance is given by (5.1). 

Proof. The proof is a straightforward extension of that of Proposition 5.2 and we only indicate the changes. 
For each argument $\theta_\pi$, we use Schur's complement formula on the whole block $\llbracket 1, r \rrbracket$. Thus, instead of (5.8), 
we get 

N^'Wd,\-lf'^{M{6,)^m{e,)) 

^ d-'N^'\\d^\ - + w{e^)) + o{^^N-^'\\d^\ - . 

This gives 

N^'WdA - if'^{M{e^) m{e^)) ^ d-^N'/\\d^\ - + wie^)) , (9.5) 



Tren 



Trea 



which is the appropriate generalization of (5.9). By definition, $H_{[1 \ldots r]}$ is independent of the family of matrices 
$(W(\theta_\pi))_{\pi \in \Pi}$, and the submatrices $H_{[\pi]}$, $\pi \in \Pi$, are obviously independent. We may now repeat verbatim 
the proof of (5.10) to get 

(9.6) 

The claim now follows from (9.5). □ 

Next, we consider the almost Gaussian case (generalization of Section 6). 

Proposition 9.5. Let $\rho > 0$. Suppose that the Wigner matrix $H$ satisfies 

$$\max_{1 \le l \le r} \max\{ |V_{il}| , |V_{jl}| \} \le \varphi^{-\rho} \quad \Longrightarrow \quad h_{ij} \text{ is Gaussian}. \tag{9.7}$$



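Define $\widetilde \Upsilon$ to be the matrix $\Upsilon$ without the shift arising from $S(V)$, i.e. $\widetilde \Upsilon = \bigoplus_{\pi \in \Pi} \widetilde \Upsilon^\pi$ with 

$$\widetilde \Upsilon^\pi := (|d_\pi| + 1) (|d_\pi| - 1)^{1/2} \Bigl( \frac{N^{1/2} V^* H V}{d_\pi^2} \Bigr)_{[\pi]}.$$

Then for large enough $K$ we have 

$$\bigoplus_{\pi \in \Pi} X^\pi \overset{d}{\sim} \bigoplus_{\pi \in \Pi} (\widetilde \Upsilon^\pi + \Psi^\pi). \tag{9.8}$$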

Proof. We start exactly as in the proof of Proposition 6.1. We repeat the steps up to (6.7) verbatim on 
the family of $r \times r$ matrices $(M(\theta_\pi) - m(\theta_\pi))_{\pi \in \Pi}$, whereby all of the reduction operations are performed 
simultaneously on each matrix $M(\theta_\pi) - m(\theta_\pi)$. Note that these matrices only differ in the argument $\theta_\pi$; 
hence all steps of the reduction (and in particular the quantities $O$, $O_1$, $U$, $W$, $\widetilde W$, $A$, $B$, $H_0$, $H_1$, $Z$, etc.) 
are the same for all matrices $M(\theta_\pi) - m(\theta_\pi)$. We take over the notation from the proof of Proposition 6.1 
without further comment. Thus we are led to the following generalization of (6.7): 

$$\bigoplus_{\pi \in \Pi} X^\pi \overset{d}{\sim} \Theta_1 + \Theta_2 + \Theta_3 + \Theta_4 + \Theta_4' , \tag{9.9}$$



where 



61 


Tren 


-N^^Wd 


62 


-®( 

Tren 


V/2(|d,| 


83 


-®( 

Tren 


V/2(|d,| 




-®( 

Tren 


V/2(|rf,i 




-®( 

Tren 


V/2(|d,i 



W*HiW 



7-3 



u*zw + w*z*u + m{wz*zw) 



[tt] 
(We deviate somewhat from the convention of Section 6 in that, unlike there, we include the normalization 
factor, which depends on $\pi$, in the definition of the variables $\Theta$.) By definition, the random matrices $\Theta_1$, 
$\Theta_2$, $\Theta_3$, $\Theta_4$, and $\Theta_4'$ are independent. They are all block diagonal, and we sometimes use the notation 
$\Theta_1 = \bigoplus_{\pi \in \Pi} \Theta_1^\pi$ etc. for their blocks. What remains is to identify their individual asymptotic distributions. 
The matrix $\Theta_1$ is easy: from Proposition 9.4 we immediately get 

e,. ® ^.., 



where $(\Phi_\pi)_{\pi \in \Pi}$ is defined as in Proposition 9.4. The matrix $\Theta_2$ is dealt with in the same way as in the proof 
of Proposition 6.1; we omit the details. By definition, $\Theta_3$ is Gaussian with mean zero. A short computation 
yields the covariance 



' Ji (Mp|-1)V^(Mp| + 1) 



%jm{W*W,W*W) 



for $\pi, \pi' \in \Pi$, $i, j \in \pi$, and $k, l \in \pi'$. We may therefore conclude that, similarly to (6.8) and (6.9), we have 



(81 + 62 + 63) ^ 



(9.10) 



Tren 



where ^^^^ is a block diagonal Gaussian matrix with mean zero and covariance 



E(*3)u(*3 



'kl 



n 

\p — TT,TT' 



{K\-iy/'i\d,\ + iy 



di 



X 2%j,h{U;Us. U*sUs) + %jMU*sUs, U^Us) + T^j,ki{W*W, W*W) (9.11) 



for $\pi, \pi' \in \Pi$, $i, j \in \pi$, and $k, l \in \pi'$. 

Next, we deal with $\Theta_4$ and $\Theta_4'$. By the Central Limit Theorem and the definition of $W$, as in the proof 
of Proposition 6.1, both of these matrices are asymptotically Gaussian (with mean zero). The variances may 
be computed along the same lines as in the proof of Proposition 6.1. The result is, for $\pi, \pi' \in \Pi$, $i, j \in \pi$, 
and $k, l \in \pi'$, 



E(6;).,(6^jfei 



n 



{\d,\-iY'\\d,\ + i) 

dl 



X \^T,j^ki{U*U, W*W) + -^{%j,kiiU*U, U*U) + n^,,ki{U)) 



7V-1/2 



a,h 



as well as 



n 



{\d,\-if/\\d,\+i) 



\p— tt.tt' 

Putting everything together, we get 



dl 



2%jm{U* U,W*W)+ %,m {W* W, W* W) 



(9.12) 
where $\bigoplus_{\pi \in \Pi} \Psi_4^\pi$ is a Gaussian block diagonal matrix with mean zero that is independent of $H$, and whose 
covariance is given by 



'^{'^Diji'^i )kl — ^irir' ^ij.kl + Stttt' Eij,kl 



n 

\P—7T,Tt' 



a.b 



Similarly to (6.121, we find using the definition of U and W that the two last lines are asymptotic to 
w^^w^_ Thus we get 



Tren 



This concludes the proof. 



(9.13) 

□ 



In order to conclude the proof of Proposition 9.3 we finally consider the general case (generalization of 
Section 7). As in Proposition 7.2, in the general case we get a deterministic shift $\bigoplus_{\pi \in \Pi} S^\pi$, where 



(|rf.| + l)(|d.|-l)^/^ ^ 



(9.14) 



The proof is similar to those of Lemma 7.1 and Proposition 7.2. We take over the setup and notation from 
Section 7 up to, but not including, (7.1). For each $\pi \in \Pi$ we define the spectral parameter $z_\pi := \theta_\pi + \mathrm{i} N^{-4}$ 
and the $|\pi| \times |\pi|$ matrix 

$$x_R^\pi := N^{1/2} (|d_\pi| - 1)^{1/2} \bigl( V^* R(z_\pi) V - m(z_\pi) \bigr)_{[\pi]}, \tag{9.15}$$

as well as the $|[\Pi]| \times |[\Pi]|$ block diagonal matrix $x_R := \bigoplus_{\pi \in \Pi} x_R^\pi$. The quantities $x_S$ and $x_T$ are defined 
analogously with $R$ replaced by $S$ and $T$ respectively. The following is the main comparison estimate, which 
generalizes Lemma 7.1. 



Lemma 9.6. Provided p is a large enough con stant, the following holds. Let f G C^{c\™^\™) be bounded 
with bounded derivatives and q = qjy be an arbitrary deterministic sequence of |[n]| x |[n]| matrices. Then 

E/(xT + 9) = Ef{xR + q)+ ^ Z^if^E-^{xR + q)+Aab + 0{^-^£ab), (9.16) 



Ef{xs+q) = Ef{xR + q)+Aab + 0{^~^£ab). 



(9.17) 



where Aab satisfies \Aab\ ^ 'fi ^ , the error term Sab is defined in (7.4), and Z^""^^ is the |[n]| x |[n]| block 
diagonal matrix 0„gn Z'^'^''''^'' with \tt\ x |7r| blocks 



-N-Wd^\ - 1)1/2 (m(zOVi6V„14j ^ m(z,)Vi'V6,K,) (*, J e tt) 



z{ab) .IT 



Proof. The proof of Lemma 7.1 may be taken over almost verbatim, following the proof of Lemma 7.13 
of [21]. □ 

The comparison estimate from Lemma 9.6 yields the shift described by $S$. The precise statement is given 
by the following proposition, which generalizes Proposition 7.2. 
Proposition 9.7. For large enough $K$ we have 

$$\bigoplus_{\pi \in \Pi} X^\pi \overset{d}{\sim} \bigoplus_{\pi \in \Pi} (\widetilde \Upsilon^\pi + S^\pi + \Psi^\pi),$$

where $S^\pi$ was defined in (9.14). 



Proof. As in the proof of Proposition 7.2, we follow the proof of Theorem 2.14 in Section 7.4 of [21]. The 
inputs are Proposition 9.5 and Lemma 9.6. □ 

Now Proposition 9.3 follows immediately from Proposition 9.7 using $\Upsilon^\pi = \widetilde \Upsilon^\pi + S^\pi$. This concludes the 
proof of Proposition 9.3. 



A. Near-degenerate perturbations 

In this appendix we record some basic results on the perturbation of near-degenerate spectra. 

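Proposition A.1. Let $A$ and $B$ be nonzero Hermitian matrices on $\mathbb{C}^N$. Let $n + m = N$, so that 
$\mathbb{C}^N = \mathbb{C}^n \oplus \mathbb{C}^m$, and assume that $A$ and $B$ are of the form 

$$A = \begin{pmatrix} A_{11} & 0 \\ 0 & A_{22} \end{pmatrix}, \qquad B = \begin{pmatrix} 0 & B_{12} \\ B_{21} & 0 \end{pmatrix}$$

(in self-explanatory notation). Define the spectral gap 

$$\Delta := \operatorname{dist}(\sigma(A_{11}), \sigma(A_{22})),$$

and assume that $\Delta > 3 \|B\|$. 
Define the domain 

$$\mathcal{D} := \bigl\{ \mu \in \mathbb{C} : \operatorname{dist}(\mu, \sigma(A_{11})) < 2 \|B\| \bigr\}.$$

Then $A + B$ has exactly $n$ eigenvalues $\mu_1 \le \dots \le \mu_n$ in $\mathcal{D}$ (counted with multiplicity), which satisfy 

$$|\mu_i - \lambda_i(A_{11})| \le \frac{\|B\|^2}{\Delta - 2 \|B\|} \qquad (i = 1, \dots, n).$$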

Proof. The eigenvalue-eigenvector equation reads $(A + B) x = \mu x$. Writing $x = (x_1, x_2) \in \mathbb{C}^n \oplus \mathbb{C}^m$ leads 
to the system 

$$A_{11} x_1 + B_{12} x_2 = \mu x_1 , \qquad A_{22} x_2 + B_{21} x_1 = \mu x_2. \tag{A.1}$$

By assumption, for $\mu \in \mathcal{D}$ we have 

$$\operatorname{dist}(\mu, \sigma(A_{22})) \ge \Delta - 2 \|B\|. \tag{A.2}$$

Since $\Delta - 2 \|B\| \ge \|B\| > 0$, we find that (A.1) is equivalent to the system 

$$x_2 = -(A_{22} - \mu)^{-1} B_{21} x_1 , \qquad A_{11} x_1 - \mu x_1 - B_{12} (A_{22} - \mu)^{-1} B_{21} x_1 = 0.$$

Replacing $B$ with $tB$ for $t \in [0, 1]$, we conclude that for $\mu \in \mathcal{D}$ we have the equivalence 

$$\mu \in \sigma(A + tB) \quad \Longleftrightarrow \quad f_t(\mu) = 0,$$

where 

$$f_t(\mu) := \det\bigl( A_{11} - \mu - t^2 B_{12} (A_{22} - \mu)^{-1} B_{21} \bigr).$$

Moreover, from Lemma A.2 below we find that $\mathcal{D}$ contains exactly $n$ eigenvalues of $A + tB$, for all $t \in [0, 1]$. 
It is well known that the eigenvalues $\mu_i(t)$ of $A + tB$ are continuous in $t$. We now claim that each such 
continuous $\mu_i(t)$ is in fact Lipschitz continuous with Lipschitz constant 

$$L := \frac{\|B\|^2}{\Delta - 2 \|B\|}.$$
Assuming this is proved, the claim immediately follows from $|\mu_i - \lambda_i| = |\mu_i(1) - \mu_i(0)| \le L$. 

In order to prove the Lipschitz continuity of $\mu_i(t)$, note that $\mu_i(t)$ is an eigenvalue of the matrix 

$$A_{11} - t^2 B_{12} (A_{22} - \mu_i(t))^{-1} B_{21}.$$

Then the Lipschitz continuity of $\mu_i(t)$ follows readily from Lemma A.2 below and the estimate 

$$\bigl\| B_{12} (A_{22} - \mu_i(t))^{-1} B_{21} \bigr\| \le L,$$

as follows from (A.2), the fact that $\mu_i(t) \in \mathcal{D}$ for all $t \in [0, 1]$, and the fact that $A_{22}$ is Hermitian. □ 
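As a sanity check of Proposition A.1, the following sketch evaluates both sides of the bound for one small real symmetric example. It is an illustration only: the cyclic Jacobi eigenvalue routine, the helper names, and the numerical values of `A11`, `A22`, `B12` are assumptions made for this example and are not taken from the text.

```python
import math

def jacobi_eigenvalues(M, sweeps=60):
    """Sorted eigenvalues of a real symmetric matrix via cyclic Jacobi rotations."""
    n = len(M)
    A = [row[:] for row in M]
    for _ in range(sweeps):
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(A[p][q]) < 1e-15:
                    continue
                # Rotation angle (|t| <= pi/4) that annihilates the (p, q) entry.
                d = A[q][q] - A[p][p]
                t = math.pi / 4 if d == 0 else 0.5 * math.atan(2.0 * A[p][q] / d)
                c, s = math.cos(t), math.sin(t)
                for k in range(n):  # rotate columns p and q
                    akp, akq = A[k][p], A[k][q]
                    A[k][p], A[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = A[p][k], A[q][k]
                    A[p][k], A[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted(A[i][i] for i in range(n))

def block_matrix(P, Q, R, S):
    """Assemble [[P, Q], [R, S]] from lists of rows."""
    return [p + q for p, q in zip(P, Q)] + [r + s for r, s in zip(R, S)]

def check_proposition_a1(A11, A22, B12):
    """Return (eigenvalues of A + B in D, eigenvalues of A11, bound ||B||^2 / (gap - 2||B||))."""
    n, m = len(A11), len(A22)
    B21 = [list(col) for col in zip(*B12)]  # transpose, so that B is symmetric
    zero = lambda r, c: [[0.0] * c for _ in range(r)]
    B = block_matrix(zero(n, n), B12, B21, zero(m, m))
    A_plus_B = block_matrix(A11, B12, B21, A22)
    norm_B = max(abs(x) for x in jacobi_eigenvalues(B))  # operator norm of symmetric B
    lam = jacobi_eigenvalues(A11)
    gap = min(abs(a - b) for a in lam for b in jacobi_eigenvalues(A22))
    if gap <= 3 * norm_B:
        raise ValueError("spectral gap assumption violated")
    mu = [x for x in jacobi_eigenvalues(A_plus_B)
          if min(abs(x - l) for l in lam) < 2 * norm_B]
    return mu, lam, norm_B ** 2 / (gap - 2 * norm_B)

A11 = [[0.0, 0.3], [0.3, 0.5]]
A22 = [[5.0, 0.2], [0.2, 6.0]]
B12 = [[0.4, 0.1], [-0.2, 0.3]]
mu, lam, bound = check_proposition_a1(A11, A22, B12)
```

In this example the domain contains exactly two eigenvalues of $A + B$, and each lies within `bound` of the corresponding eigenvalue of $A_{11}$.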



Lemma A.2. Let $A$ and $B$ be Hermitian matrices. Then the spectrum of $A + B$ is contained in the closed 
$\|B\|$-neighbourhood of the spectrum of $A$. 

Proof. Using the identity $(A + B - z)^{-1} = (A - z)^{-1} \bigl( 1 + B (A - z)^{-1} \bigr)^{-1}$ we conclude that if $\operatorname{dist}(z, \sigma(A)) > \|B\|$ 
then $z \notin \sigma(A + B)$. □ 
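Lemma A.2 can be checked numerically in the simplest nontrivial case. The sketch below restricts to real symmetric $2 \times 2$ matrices, where eigenvalues have a closed form; the function names, the sampling range, and the seed are illustrative assumptions.

```python
import math
import random

def eig2(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]], in increasing order."""
    mean, rad = 0.5 * (a + d), math.hypot(0.5 * (a - d), b)
    return mean - rad, mean + rad

def max_excess(trials=1000, seed=1):
    """Largest value of dist(mu, spec(A)) - ||B|| over random symmetric 2 x 2
    matrices A, B and eigenvalues mu of A + B; Lemma A.2 predicts a value <= 0."""
    rng = random.Random(seed)
    worst = -math.inf
    for _ in range(trials):
        a, b, d, e, f, g = (rng.uniform(-1.0, 1.0) for _ in range(6))
        spec_A = eig2(a, b, d)
        norm_B = max(abs(x) for x in eig2(e, f, g))  # operator norm of symmetric B
        for mu in eig2(a + e, b + f, d + g):
            worst = max(worst, min(abs(mu - lam) for lam in spec_A) - norm_B)
    return worst
```

Up to floating-point rounding, `max_excess()` is nonpositive, in agreement with the lemma.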